this is because an LLM is not made for playing chess
You say you produce good oranges but my machine for testing apples gave your oranges a very low score.
No, more like "Your marketing team, sales team, the news media at large, and random hype men all insist your orange machine works amazingly on any fruit if you know how to use it right. It didn't work on my strawberries when I gave it all the help I could, and it was outperformed by my 40-year-old strawberry machine. Please stop selling the idea that it works on all fruit."
This study is specifically a counter to the constant hype that these LLMs will revolutionize absolutely everything, and the constant word choices used in discussion of LLMs that imply they have reasoning capabilities.
An LLM is a poor computational/predictive paradigm for playing chess.
The underlying neural network tech is the same as what the best chess AIs (AlphaZero, Leela) use. The problem is, as you said, that ChatGPT is designed specifically as an LLM so it’s been optimized strictly to write semi-coherent text first, and then any problem solving beyond that is ancillary. Which should say a lot about how inconsistent ChatGPT is at solving problems, given that it’s not actually optimized for any specific use cases.
Yes, I agree wholeheartedly with your clarification.
My career path, as I stated in a different comment in regards to neural networks, is focused on generative DNNs for CAD applications and parametric 3D modeling. Before that, I began as a researcher in cancerous tissue classification and object detection in medical diagnostic imaging.
Thus, large language models are well out of my area of expertise in terms of the architecture of their models.
However, fundamentally it boils down to the fact that the specific large language model used was designed to predict text and not necessarily solve problems/play games to "win"/"survive".
(I admit that I'm just parroting what you stated and maybe rehashing what I stated even before that, but I like repeating and refining in simple terms to practice explaining to laymen and, dare I say, clients. It helps me feel as if I don't come off too pompously when talking about this subject to others; forgive my tedium.)
Using an LLM as a chess engine is like using a power tool as a table leg. Pretty funny honestly, but it's obviously not going to be good at it, at least not without scaffolding.
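For what it's worth, "scaffolding" here could be as simple as a wrapper that only accepts legal moves. A rough sketch, assuming the python-chess library and a hypothetical ask_llm_for_move() helper (not any real project's code):

```python
# Toy illustration of "scaffolding" an LLM chess player: the model only
# proposes moves, and a real chess library rejects anything illegal.
import random
import chess  # python-chess

def ask_llm_for_move(fen: str) -> str:
    """Hypothetical stand-in for an LLM call; would return a move in SAN."""
    return "Nf3"  # placeholder so the sketch runs end to end

def scaffolded_move(board: chess.Board, retries: int = 3) -> chess.Move:
    """Ask the model, but only accept a move the chess library can parse as legal."""
    for _ in range(retries):
        try:
            return board.parse_san(ask_llm_for_move(board.fen()))
        except ValueError:  # illegal or unparseable suggestion
            continue
    # Fall back to any legal move so hallucinated notation can't end the game.
    return random.choice(list(board.legal_moves))

board = chess.Board()
board.push(scaffolded_move(board))
print(board)
```

Even with that kind of guardrail the model is still just pattern-matching notation; the scaffolding only keeps it from forfeiting on gibberish.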
2025 Mazda MX-5 Miata 'got absolutely wrecked' by Inflatable Boat in beginner's boat racing match — Mazda's newest model bamboozled by 1930s technology.
I swear every single article critical of current LLMs is like, "The square got BLASTED by the triangle shape when it completely FAILED to go through the triangle shaped hole."
It's newsworthy when the sellers of squares are saying that nobody will ever need a triangle again, and the shape-sector of the stock market is hysterically pumping money into companies that make or use squares.
The press release where OpenAI said we'd never need chess players again
It's also from a company claiming they're getting closer to creating a morphing shape that can match any hole.
ChatGPT has been, hands down, the worst AI coding assistant I've ever used.
It regularly suggests code that doesn't compile or isn't even for the language.
It generally suggests a block of code that is just a copy of the lines I just wrote.
Sometimes it likes to suggest setting the same property like 5 times.
It is absolute garbage and I do not recommend it to anyone.
My favorite thing is how it constantly writes code against libraries that don't exist.
Oh man, I feel this. A couple of times I've had to field questions about a REST API I support, where people ask why they get errors when they supply a specific attribute. That attribute never existed: not in our code, not in our documentation, we never even thought of it. So I say, "Well, that attribute is invalid, I'm not sure where you saw to do that." They get insistent that the code was generated by a very good LLM, so we must be missing something...
Tbf, the article should probably mention the fact that machine learning programs designed to play chess blow everything else out of the water.
Hardly surprising. LLMs aren't -thinking-, they're just shitting out the next token for any given input of tokens.
That's exactly what thinking is, though.
An LLM is an ordered series of parameterized / weighted nodes which are fed a bunch of tokens, and millions of calculations later out comes the next token, which gets appended and the process repeats. It's like turning a handle on some complex Babbage-esque machine. LLMs use a tiny bit of randomness ("temperature") when choosing the next token so the responses are not identical each time.
But it is not thinking. Not even remotely so. It's a simulacrum. If you want to see this, run ollama with the temperature set to 0, e.g.
ollama run gemma3:4b
>>> /set parameter temperature 0
>>> what is a leaf
You will get the same answer every single time.
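To make the determinism concrete, here is a minimal Python sketch of temperature-scaled sampling (a toy illustration, not any particular model's internals): as the temperature goes to 0, the softmax collapses to argmax, so the same input always yields the same next token.

```python
# Toy sketch of temperature-based next-token sampling.
# At temperature 0 the choice is a plain argmax, hence identical answers every run.
import math
import random

def sample_next_token(logits, temperature=1.0):
    """Pick a token index from raw model scores (logits)."""
    if temperature <= 0:
        # Greedy decoding: always take the highest-scoring token.
        return max(range(len(logits)), key=lambda i: logits[i])
    # Scale logits by temperature, then softmax into probabilities.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Sample proportionally to the probabilities.
    return random.choices(range(len(logits)), weights=probs, k=1)[0]

# Toy vocabulary and scores; real models have tens of thousands of tokens.
vocab = ["leaf", "tree", "photosynthesis", "banana"]
logits = [3.2, 2.9, 1.1, -0.5]

print(vocab[sample_next_token(logits, temperature=0)])    # always "leaf"
print(vocab[sample_next_token(logits, temperature=0.8)])  # usually "leaf", sometimes not
```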
I know what an LLM is doing. You don't know what your brain is doing.