top 28 comments
[-] zeropointone@lemmy.world 11 points 3 days ago

As if a stochastic parrot could ever be able to count or calculate...

LLMs work really well. You get something out of them that resembles human language. That is what they have been designed for and nothing else. Stop trying to make a screwdriver shoot laser beams; it's not going to happen.

[-] theunknownmuncher@lemmy.world -2 points 2 days ago* (last edited 2 days ago)

As if a stochastic parrot could ever be able to count or calculate…

https://www.anthropic.com/research/tracing-thoughts-language-model

Well, that's provably false: the neural network can be traced to reveal the exact process and calculations it follows when doing math. See the research linked above, under the heading "Mental Math".

The reason LLMs struggle to count letters is tokenization. The text is first converted to tokens, numeric IDs that represent whole words or parts of words, before being given to the LLM, and the LLM has no concept of or access to the underlying letters that make up the words. The LLM also outputs only tokens, which are converted back into text.
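
In case anyone wants to see this for themselves, here's a minimal sketch using OpenAI's open-source tiktoken tokenizer (the exact IDs and splits depend on the vocabulary; any BPE tokenizer shows the same effect):

```python
# pip install tiktoken
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # BPE vocabulary used by GPT-4-era models

tokens = enc.encode("blueberry")
print(tokens)                             # a short list of integer IDs, not letters
print([enc.decode([t]) for t in tokens])  # multi-letter chunks, e.g. ['blue', 'berry']
```

The model only ever sees those integer IDs, so "how many r's are in blueberry" asks about structure the input literally doesn't contain.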

EDIT: you can downvote me, but facts are facts lol.

[-] jacksilver@lemmy.world 5 points 2 days ago

There is definitely more going on with LLMs than just direct parroting.

However, there is also an upper limit to what an LLM can logic/calculate. Since an LLM's forward pass basically boils down to a fixed series of matrix operations, there is an upper bound on how accurately it can calculate anything.

Most chat systems use Python behind the scenes for any actual math, but if you run a raw LLM you can see the errors grow faster as the operations climb in complexity (addition, multiplication, powers, etc.).
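
As a toy illustration of that upper bound (my own sketch, not a benchmark of any model): a purely linear map cannot represent multiplication exactly, no matter how you fit its weights:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 100, 10_000)
y = rng.uniform(0, 100, 10_000)

# Best purely linear approximation of z = x * y, i.e. z ~ a*x + b*y + c
A = np.column_stack([x, y, np.ones_like(x)])
coef, *_ = np.linalg.lstsq(A, x * y, rcond=None)

err = np.abs(A @ coef - x * y)
print(f"mean absolute error: {err.mean():.1f}")  # large: no a, b, c multiply exactly
```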

[-] theunknownmuncher@lemmy.world 0 points 2 days ago* (last edited 2 days ago)

Yes, exactly. It can do basic math, and it also doesn't really matter, because it is really good at interfacing with tools/functions for calculation anyway.

However, there is also an upper limit to what an LLM can logic/calculate. Since an LLM's forward pass basically boils down to a fixed series of matrix operations, there is an upper bound on how accurately it can calculate anything.

Also, this is only true when LLMs are not using chain of thought.

[-] Onomatopoeia@lemmy.cafe 2 points 2 days ago

So what you're saying is it can't count.

Got it.

[-] theunknownmuncher@lemmy.world 2 points 2 days ago* (last edited 2 days ago)

Reading comprehension. The only thing it cannot count is the letters of tokenized words specifically. If you separate the letters out into separate tokens ("B L U E B E R R Y", etc.), it will get it correct 100% of the time.
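
You can check the difference with a tokenizer directly (same tiktoken sketch as above; exact splits vary by vocabulary):

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
split = lambda s: [enc.decode([t]) for t in enc.encode(s)]

print(split("BLUEBERRY"))          # a few multi-letter chunks
print(split("B L U E B E R R Y"))  # roughly one token per letter, so counting works
```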

[-] zeropointone@lemmy.world 0 points 2 days ago

What makes you think that using single letters as tokens instead could teach a stochastic parrot to count or calculate? Both are abilities. You can't create an ability from a set of data alone, no matter how much data you have. You can only make a model seem to have that ability. Again: all you can ever get out of it is something that resembles human language. There is nothing beyond/behind that, by design. Not even hallucinations. Whenever an LLM advises you to eat a rock per day, it is still working as intended, because it outputs a correct-sounding sentence purely and entirely based on probability. But counting and calculating are not based on probability, which is something everyone who ever took a math class knows very well. No math teacher will let students guess the result of an equation.

[-] theunknownmuncher@lemmy.world 2 points 2 days ago

What makes you think that using single letters as tokens instead could teach a stochastic parrot to count or calculate? Both are abilities. You can't create an ability from a set of data alone, no matter how much data you have. You can only make a model seem to have that ability.

Yeah, that's just not how neural networks work...

No math teacher will let students guess the result of an equation.

https://en.wikipedia.org/wiki/Regression_analysis

You tried.

[-] zeropointone@lemmy.world -3 points 2 days ago

So you don't know how math works.

[-] theunknownmuncher@lemmy.world 4 points 2 days ago* (last edited 2 days ago)

https://en.wikipedia.org/wiki/Universal_approximation_theorem Actual math directly contradicts your beliefs. I know it's trendy and you want to feel smarter than people who have spent literally decades researching NNs and staked billions of dollars developing them, but you're wrong.
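
The theorem is easy to demo. Even a single hidden layer of random tanh features, with only a linear readout fitted on top, approximates a nonlinear function; here's a minimal numpy sketch (the width, scale, and target function are arbitrary choices of mine):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-np.pi, np.pi, 500)[:, None]
target = np.sin(3 * x).ravel()  # a nonlinear function to approximate

# One hidden tanh layer with fixed random weights; only the readout is fitted.
W = rng.normal(scale=2.0, size=(1, 200))
b = rng.normal(scale=2.0, size=200)
H = np.tanh(x @ W + b)  # hidden activations, shape (500, 200)

w_out, *_ = np.linalg.lstsq(H, target, rcond=None)
print(f"max error: {np.abs(H @ w_out - target).max():.4f}")  # shrinks as width grows
```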

Your claim is like claiming that boolean circuits cannot do math because all they do is "true" or "false".

And also, https://en.wikipedia.org/wiki/Recurrent_neural_network#Neural_Turing_machines recurrent neural networks are Turing complete when paired with memory, and can therefore perform any calculation or computation that a conventional computer can.

[-] zeropointone@lemmy.world -1 points 2 days ago

And you don't know what a circular argument is either...

No, 2+2 is never "about 4", nor is it 4 merely in most cases. It is always exactly 4. And no LLM can ever come to this conclusion. LLMs fail at math in a truly spectacular way. Just like no LLM will ever be able to understand what complementary colors are. Which is one of my favorite tests, because it has a 100% error rate.

[-] theunknownmuncher@lemmy.world 2 points 2 days ago* (last edited 2 days ago)

Alright, well you've already been corrected with sources and facts, and yet are still repeating misinformation. You can choose to ignore reality and remain incorrect and ignorant.

Just like no LLM will ever be able to understand what complementary colors are. Which is one of my favorite tests, because it has a 100% error rate.

LOL

The funniest part of this is not that an LLM just went 3 for 3 (a 100% success rate, not a 100% error rate), but that your favorite test is one you incorrectly believe "no LLM will ever be able to" pass because...

LLMs work really well. You get something out of them that resembles human language. That is what they have been designed for and nothing else. Stop trying to make a screwdriver shoot laser beams; it's not going to happen.

^ this you??? "My favorite test is to see if the screwdriver shoots laser beams" 🙃

[-] zeropointone@lemmy.world -1 points 2 days ago

And why didn't you include the name of the model in your test? Looks like you don't want me to try it myself. It would be interesting to do so. Of course with values that don't fit neatly into 8 bits. What if I define the range from 0 to 47204 for each color channel instead? What if I used CMY(K) instead of RGB? A "great" AI must be able to handle all of that, and of course correctly explain what complementary colors are (which you didn't include either). So yeah, what you provided does not go beyond the output of htmlcolorcodes.com, a very simple website with very simple code. I doubt it requires much power either.

[-] theunknownmuncher@lemmy.world 2 points 2 days ago* (last edited 2 days ago)

And why didn’t you include the name of the model in your test?

~~I was using standard RGB hex codes, so I didn't really need to specify, because it's the assumed default. If it were something different, I would need to specify.~~ EDIT: oh, I just realized you meant the LLM model, not the color model (RYB vs RGB). It was just from ChatGPT; I thought the interface would be recognizable enough.

Looks like you don’t want me to try it myself. It would be interesting to do so.

Huh? What do you mean? Go try it!

Of course with values that don't fit neatly into 8 bits. What if I define the range from 0 to 47204 for each color channel instead?

Yeah, so this is already a thing. 24-bit color (8 bits per color channel) already gives you 16,777,216 colors, which is pretty good, but if you want more precision you can just use decimal (floating-point) numbers for each channel, like sRGB(0.25, 0.5, 1.0) (https://en.wikipedia.org/wiki/SRGB), or, even better, use OKLCH (https://en.wikipedia.org/wiki/Oklab_color_space). This is a solved problem. Or you could just define your range as 0 to 47204.
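
The arithmetic is the same whatever range you pick: the complement just reflects each channel across the range (a sketch of the idea; the 0..47204 range is from your own example):

```python
def complement(channels, max_value=255):
    """RGB complement: subtract each channel from the range maximum."""
    return tuple(max_value - c for c in channels)

print(complement((255, 0, 0)))           # red -> (0, 255, 255), i.e. cyan
print(complement((47204, 0, 0), 47204))  # the same answer in a 0..47204 range
print(complement((1.0, 0.0, 0.0), 1.0))  # and with float sRGB channels
```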

So... we've gone from "no LLM will ever be able to understand what complementary colors are" to "b-b-but what about arbitrary color models I make up??" And yeah, it will handle those too; you just have to tell it which color model you're using when you prompt it.

[-] zeropointone@lemmy.world -1 points 2 days ago

All LLMs still claim that green is the complementary color to red...

[-] theunknownmuncher@lemmy.world 3 points 2 days ago* (last edited 2 days ago)

Green is the correct answer in the RYB color model, which is traditionally used in art and most commonly taught in schools.

And... wait for it...

And an open-weight model (qwen3:32b)

So you're:

[-] zeropointone@lemmy.world -2 points 2 days ago

And you still ignore what I wrote. Because you can't process how wrong you and your AI are.

[-] theunknownmuncher@lemmy.world 4 points 2 days ago* (last edited 2 days ago)

😂 multiple LLMs literally gave the exact answer that you claim they can't correctly give, on the very first try. Checkmate.

[-] zeropointone@lemmy.world -3 points 2 days ago

Funny. Each time I ask any LLM what the complementary color to red is, I get green as the answer instead of cyan (cyan being the only correct answer), along with a completely wrong explanation of complementary colors based on digital screens. So yeah, LLMs still fail miserably at language-based tests. And rearranging complex equations doesn't work either.

[-] theunknownmuncher@lemmy.world 3 points 2 days ago* (last edited 2 days ago)

Each time I ask any LLM what the complementary color to red is, I get green as the answer instead of cyan (cyan being the only correct answer), along with a completely wrong explanation of complementary colors based on digital screens.

🤦 Oh... oh wow, I was giving you way more credit than your actual claim deserved. You do realize there is more than one color model? https://en.wikipedia.org/wiki/Complementary_colors#In_different_color_models You should probably read the explanation about complementary colors on digital screens that they are giving you (or just pay attention in elementary art class), because you might actually learn something new.

Red-Yellow-Blue (RYB) and Cyan-Magenta-Yellow (CMY) are both subtractive color models. RGB is an additive color model.

https://en.wikipedia.org/wiki/RYB_color_model

Try giving the LLM the hex color code and the color model you're using that code in, and it will give you the correct complementary color.
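
The RGB case is also trivially checkable by hand, which makes it a fair test (a quick sketch, assuming 24-bit hex codes):

```python
def hex_complement(hex_code: str) -> str:
    """Complement of a 24-bit RGB hex color, channel by channel."""
    r, g, b = (int(hex_code.lstrip("#")[i:i + 2], 16) for i in (0, 2, 4))
    return "#{:02X}{:02X}{:02X}".format(255 - r, 255 - g, 255 - b)

print(hex_complement("#FF0000"))  # '#00FFFF': cyan, the RGB complement of red
```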

[-] zeropointone@lemmy.world -3 points 2 days ago

And you showed that you don't understand complementary colors, just like the AI. The above color circle is wrong. Why? Because of tests like the afterimage test (example: https://i.pinimg.com/originals/da/7c/fb/da7cfba87ffdc8f426953397162329b4.gif), which prove that purple (as pictured above) can never be the complementary color to yellow; it always has to be a deep blue. It doesn't matter whether you're using additive or subtractive colors (afterimage tests work both passively and actively), because in the end it's all about light hitting our L/M/S cones and how our brains interpret the signals from those cones (https://en.wikipedia.org/wiki/Metamerism_(color)). Metamerism explains why engineers chose perceptually equidistant cyan/magenta/yellow for (simple) printing ("Subtractive colors") and perceptually equidistant red/green/blue for actively emitting devices like cameras and displays ("Active colors").

And if you now say "but bro, I see a green shifting towards blue in the afterimage test": didn't your wonderful AI tell you about the Abney effect? Weird. It's all well known and documented on the web that has been used to train your wonderful AI. But without being able to understand all of that, there is no way your wonderful AI can tell you which of all those color circles is the correct one (and there is only one, because it does not violate the CIE 1931 color space). It's up to you to either learn and understand, or to blindly follow an LLM that sticks to green being the complementary color to red. Because all the LLM can do is repeat the garbage it has been trained on. Because it's nothing more than a stochastic parrot. Your choice.

[-] theunknownmuncher@lemmy.world 3 points 2 days ago* (last edited 2 days ago)

There is no one "correct" color circle. And your misguided beliefs about color theory do not have anything to do with LLMs.

By the way, they're called "additive colors", not "active colors". 🙃

[-] zeropointone@lemmy.world -2 points 2 days ago

Additive colors -> active light emitters. Which should be obvious. But yeah, you simply lack the ability to think beyond what AI tells you. You understand nothing. You're nothing more than a stochastic parrot yourself. Enjoy your daily rock.

[-] theunknownmuncher@lemmy.world 4 points 2 days ago* (last edited 2 days ago)

😂 Alright, well, you've been corrected and proven wrong, with sources and screenshots. And clearly you're getting a teeny bit upset over it. Sorry! There's nothing wrong with learning something new, and it's okay that you made a mistake.

[-] picnicolas@slrpnk.net 3 points 2 days ago

This was a fun rabbit hole to go down! I tend to agree with most of the takes here on Lemmy, but the complete AI derision is pretty wild and unfounded in reality. I have plenty of concerns about the tech, but to say it's useless you'd have to have never even tried it yourself. I appreciate your patience, dedication to the truth, willingness to explain, and experimental attitude here.

[-] theunknownmuncher@lemmy.world 2 points 2 days ago

This was a super bizarre case of "the neural networks that are a predictive model of all of the world's knowledge don't share my belief that only one specific color model is valid, therefore they suck" 😂

[-] fubarx@lemmy.world 4 points 2 days ago

At this point, wouldn't be surprised if they just hardcoded the strawberry/blueberry answer into the server.
