AI achieves silver-medal standard solving International Mathematical Olympiad problems (deepmind.google)

submitted 1 month ago by morrowind@lemmy.ml to c/technology@lemmy.world

Today, we present AlphaProof, a new reinforcement-learning based system for formal math reasoning, and AlphaGeometry 2, an improved version of our geometry-solving system. Together, these systems solved four out of six problems from this year’s International Mathematical Olympiad (IMO), achieving the same level as a silver medalist in the competition for the first time.

[-] hendrik@palaver.p3x.de 9 points 1 month ago

Hehe, namedropping "AGI" in the very first paragraph and then going ahead with an AI that is super tailored to a narrow task like formal math proofs and geometry...

[-] morrowind@lemmy.ml 1 points 1 month ago

It could be used in a mixture-of-experts type situation.

[-] hendrik@palaver.p3x.de 3 points 1 month ago* (last edited 1 month ago)

Sure. But none of this is about that. And I somehow doubt that'll be the path towards AGI anyways. Does any combination of narrow abilities become general at some point? Is the sum more than its parts? I think so. Especially with intelligence.
And MoE comes with the issue that it can't really apply knowledge from one domain to another. At least if you separate the subjects. Whereas I, as a generally intelligent being, can apply my math skills to engineering, coding, and a plethora of everyday tasks. So I'm not sure if MoE helps with that.

[-] technocrit@lemmy.dbzer0.com 4 points 1 month ago* (last edited 1 month ago)

"Artificial general intelligence (AGI) with advanced mathematical reasoning has the potential to unlock new frontiers in science and technology."

The first sentence is completely irrelevant grifting. Red flag.

"First, the problems were manually translated into formal mathematical language for our systems to understand... Our systems solved one problem within minutes and took up to three days to solve the others."

LMAO. If people translate the question into symbols, then ofc a computer can solve the problem in a few minutes.

If you translate your budget into a spreadsheet, then a computer can calculate your surplus or deficit in microseconds. But the actually hard part is making the spreadsheet.

"AlphaProof solved two algebra problems and one number theory problem... two combinatorics problems remained unsolved."

So they got a 60%? That's pretty good for a human but not so much for a purported "AI".

I imagine it might not be difficult to develop classes of problems that are easy or hard/impossible for automated proof. Such classes probably already exist, but the grifters don't want to talk about limitations.

[-] morrowind@lemmy.ml 0 points 1 month ago

Did you actually look at the problems or even further down the page before making these sweeping statements? Simply transforming it into formal mathematical language does not make the problems trivial. These aren't arithmetic problems.

Despite failing the two problems, it did better than the majority of the contestants, who are some of the most talented math students in the world.

The only major catch was that it did not finish in the allotted time, since it ran for days. But once the method has been established, that's a performance problem.

DeepMind is one of the most respected labs in the AI space, and has been since well before the modern generative AI trend. They're not some random grifters.

[-] mrroman@lemmy.world 3 points 1 month ago

I wonder if they are sure that similar exercises weren't in the training set for the AI. Such competitions usually have recurring patterns in their exercises, and people usually prepare for them by solving a large number of exercises to catch the patterns.

[-] morrowind@lemmy.ml 1 points 1 month ago

Very likely. No worse than a human in that case.

[-] morrowind@lemmy.ml 2 points 1 month ago

I know there's a strong anti-AI sentiment on Lemmy, but I would advise reading at least the article, if not more details, before denouncing it.

[-] technocrit@lemmy.dbzer0.com 3 points 1 month ago

It should be denounced from the first sentence.

this post was submitted on 26 Jul 2024

4 points (54.8% liked)
