Google DeepMind’s Game-Playing AI Tackles a Chatbot Blind Spot

Several years before ChatGPT began jibber-jabbering away, Google developed a very different kind of artificial intelligence program called AlphaGo that learned to play the board game Go with superhuman skill through tireless practice.

Researchers at the company have now published work that combines the abilities of a large language model (the AI behind today’s chatbots) with those of AlphaZero, a successor to AlphaGo that can also play chess, to solve very tricky mathematical proofs.

Their new Frankensteinian creation, dubbed AlphaProof, has demonstrated its prowess by tackling several problems from the 2024 International Mathematical Olympiad (IMO), a prestigious competition for high school students.

AlphaProof uses the Gemini large language model to convert naturally phrased math questions into a programming language called Lean. This provides the training fodder for a second algorithm to learn, through trial and error, how to find proofs that can be confirmed as correct.
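To give a sense of what a formalized statement looks like, here is a minimal sketch in Lean 4. It is an illustrative toy written for this article, not output from AlphaProof or Gemini, and the theorem name `even_add_even` is hypothetical. It restates the informal claim "the sum of two even natural numbers is even" in a form that Lean can check mechanically, assuming only Lean's core library.

```lean
-- Informal statement: "The sum of two even natural numbers is even."
-- Hypothetical formalization; "even" is spelled out as "equal to 2 * something".
theorem even_add_even (a b : Nat)
    (ha : ∃ k, a = 2 * k) (hb : ∃ m, b = 2 * m) :
    ∃ n, a + b = 2 * n :=
  match ha, hb with
  -- Unpack the two witnesses k and m, then offer k + m as the new witness.
  | ⟨k, hk⟩, ⟨m, hm⟩ => ⟨k + m, by rw [hk, hm, Nat.mul_add]⟩
```

Lean accepts the file only if every step of the proof type-checks, which is the kind of yes-or-no confirmation a trial-and-error learner can use as a signal.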

Earlier this year, Google DeepMind revealed another math algorithm called AlphaGeometry that also combines a language model with a different AI approach. AlphaGeometry uses Gemini to convert geometry problems into a form that can be manipulated and tested by a program that handles geometric elements. Google today also announced a new and improved version of AlphaGeometry.

The researchers found that their two math programs could provide proofs for IMO puzzles as well as a silver medalist could. The programs solved two algebra problems and one number theory problem out of six in total. They solved one problem in minutes but took up to several days to figure out others. Google DeepMind has not disclosed how much computer power it threw at the problems.

Google DeepMind calls the approach used for both AlphaProof and AlphaGeometry “neuro-symbolic” because it combines the pure machine learning of an artificial neural network, the technology that underpins most progress in AI of late, with the language of conventional programming.

“What we’ve seen here is that you can combine the approach that was so successful, and things like AlphaGo, with large language models and produce something that is extremely capable,” says David Silver, the Google DeepMind researcher who led work on AlphaZero. Silver says the techniques demonstrated with AlphaProof should, in theory, extend to other areas of mathematics.

Indeed, the research raises the prospect of addressing the worst tendencies of large language models by applying logic and reasoning in a more grounded fashion. As miraculous as large language models can be, they often struggle to grasp even basic math or to reason through problems logically.

In the future, the neuro-symbolic method could provide a means for AI systems to turn questions or tasks into a form that can be reasoned over in a way that produces reliable results. OpenAI is also rumored to be working on such a system, codenamed “Strawberry.”

There is, however, a key limitation with the systems revealed today, as Silver acknowledges. Math solutions are either correct or incorrect, allowing AlphaProof and AlphaGeometry to work their way toward the right answer. Many real-world problems—coming up with the ideal itinerary for a trip, for instance—have many possible solutions, and which one is ideal may be unclear. Silver says the solution for more ambiguous questions may be for a language model to try to determine what constitutes a “right” answer during training. “There’s a spectrum of different things that can be tried,” he says.

Silver is also careful to note that Google DeepMind won’t be putting human mathematicians out of jobs. “We are aiming to provide a system that can prove anything, but that’s not the end of what mathematicians do,” he says. “A big part of mathematics is to pose problems and find what are the interesting questions to ask. You might think of this as another tool along the lines of a slide rule or calculator or computational tools.”
