Out-calculating humans for decades (centuries if you count non-digital calculators)
Out-chessing humans since 1997
Out-geometrying high schoolers since ~2023
I mean, geometry and trig have some of the simplest, most straightforward, least ambiguous rulesets of any math. Why wouldn’t a computer outperform a human?
Geometry is a bit tricky. A lot of “obvious” facts about geometry are less obvious to prove from a given collection of axioms forming a model of geometry, because their “obviousness” stems from our natural faculties for understanding space and position. Historically, some things that were “obviously” true in geometry turned out to be false, or to depend on unwritten assumptions, for complex reasons. In this light, it may be surprising if current AI can beat humans’ intuition plus logic using purely analytic tools.
From the article:
For many years, we’ve had software that can generate lists of valid conclusions that can be drawn from a set of starting assumptions. Simple geometry problems can be solved by “brute force”: mechanically listing every possible fact that can be inferred from the given assumption, then listing every possible inference from those facts, and so on until you reach the desired conclusion.
But this kind of brute-force search isn’t feasible for an IMO-level geometry problem because the search space is too large. Not only do harder problems require longer proofs, but sophisticated proofs often require the introduction of new elements to the initial figure—as with point D in the above proof. Once you allow for these kinds of “auxiliary points,” the space of possible proofs explodes and brute-force methods become impractical.
So, mathematicians must develop an intuition about which proof steps will likely lead to a successful result. DeepMind’s breakthrough was to use a language model to provide the same kind of intuitive guidance to an automated search process.
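For anyone who wants to see the shape of that idea, here is a toy sketch in Python. It is not DeepMind’s code: the rule format, the isosceles-triangle facts, and the names `deduce_closure`, `guided_search`, and `suggest_midpoint` are all made up for illustration, and a hard-coded function stands in for the language model. The brute-force part keeps applying rules until nothing new appears; when that stalls, the “intuition” part proposes one auxiliary construction and deduction runs again.

```python
from typing import Callable, FrozenSet, Iterable, Optional, Set, Tuple

# A rule is (premises, conclusion): if every premise is known, add the conclusion.
Rule = Tuple[FrozenSet[str], str]

def deduce_closure(facts: Iterable[str], rules: Iterable[Rule]) -> Set[str]:
    """Brute force: apply every rule repeatedly until no new fact appears."""
    rules = list(rules)
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known

def guided_search(
    facts: Iterable[str],
    rules: Iterable[Rule],
    goal: str,
    suggest: Callable[[Set[str]], Optional[str]],
    max_suggestions: int = 3,
) -> bool:
    """Alternate exhaustive deduction with suggested auxiliary constructions."""
    rules = list(rules)
    known = deduce_closure(facts, rules)
    for _ in range(max_suggestions):
        if goal in known:
            return True
        extra = suggest(known)  # stand-in for the language model's guess
        if extra is None:
            break
        known = deduce_closure(known | {extra}, rules)
    return goal in known

# Toy problem (hypothetical encoding): in triangle ABC with AB = AC,
# show that angle ABC equals angle ACB.
FACTS = {"AB = AC"}
GOAL = "angle ABC = angle ACB"
RULES = [
    (frozenset({"AB = AC", "D is the midpoint of BC"}),
     "triangle ABD is congruent to triangle ACD"),
    (frozenset({"triangle ABD is congruent to triangle ACD"}), GOAL),
]

def suggest_midpoint(known: Set[str]) -> Optional[str]:
    """Hard-coded 'intuition': propose the auxiliary point D once."""
    aux = "D is the midpoint of BC"
    return None if aux in known else aux

if __name__ == "__main__":
    print(GOAL in deduce_closure(FACTS, RULES))                 # deduction alone
    print(guided_search(FACTS, RULES, GOAL, suggest_midpoint))  # with a suggestion
```

With only two rules the search is trivial. The point is just that plain deduction never gets to the goal here until someone (or something) proposes point D.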
Out-going humans since 2015
As an introvert, I’m sure the computers are more outgoing than me too 😪
Here’s the summary for the Wikipedia article you mentioned in your comment:
AlphaGo is a computer program that plays the board game Go. It was developed by the London-based DeepMind Technologies, an acquired subsidiary of Google (now Alphabet Inc.). Subsequent versions of AlphaGo became increasingly powerful, including a version that competed under the name Master. After retiring from competitive play, AlphaGo Master was succeeded by an even more powerful version known as AlphaGo Zero, which was completely self-taught without learning from human games.
I like that high schoolers are the epitome of human geometry knowledge for comparison. It’s all downhill from there; you’ll never use most of that crap again, kids.
Yeah but I bet AI sucks at smoking weed behind the bleachers or getting cheerleaders pregnant.
I wouldn’t be so sure…
If those kids all sucked, there wouldn’t be any teen pregnancies.
Well, you may have a point.
This is the best summary I could come up with:
“Because language models excel at identifying general patterns and relationships in data, they can quickly predict potentially useful constructs, but often lack the ability to reason rigorously or explain their decisions,” DeepMind writes.
To overcome this difficulty, DeepMind paired a language model with a more traditional symbolic deduction engine that performs algebraic and geometric reasoning.
The research was led by Trieu Trinh, a computer scientist who recently earned his PhD from New York University.
Evan Chen, a former Olympiad gold medalist who evaluated some of AlphaGeometry’s output, praised it as “impressive because it’s both verifiable and clean.” Whereas some earlier software generated complex geometry proofs that were hard for human reviewers to understand, the output of AlphaGeometry is similar to what a human mathematician would write.
AlphaGeometry is part of DeepMind’s larger project to improve the reasoning capabilities of large language models by combining them with traditional search algorithms.
The original article contains 553 words, the summary contains 172 words. Saved 69%. I’m a bot and I’m open source!
geometry is fucking stupid it still can’t suck my dick