
A shocking story was promoted on the "front page" or main feed of Elon Musk's X on Thursday:

"Iran Strikes Tel Aviv with Heavy Missiles," read the headline.

This would certainly be a worrying world news development. Earlier that week, Israel had conducted an airstrike on Iran's embassy in Syria, killing two generals as well as other officers. Retaliation from Iran seemed like a plausible occurrence.

But there was one major problem: Iran did not attack Israel. The headline was fake.

Even more concerning, the fake headline was apparently generated by X's own official AI chatbot, Grok, and then promoted by X's trending news product, Explore, on the very first day of an updated version of the feature.

[-] wizardbeard@lemmy.dbzer0.com 55 points 7 months ago

Yep. To add on, this is exactly what all the "AI haters" (myself included) are going on about when they say stuff like there isn't any logic or understanding behind LLMs, or when they say they are stochastic parrots.

LLMs are incredibly good at generating text that works grammatically and reads like it was put together by someone knowledgeable and confident, but they have no concept of "truth" or reality. They just have a ton of absurdly complicated technical data about how words/phrases/sentences are related to each other on a structural basis. It's all just really complicated math about how text is put together. It's absolutely amazing, but it is also literally and technologically impossible for that to spontaneously coalesce into reason/logic/sentience.

Turns out that if you get enough of that data together, it makes a very convincing appearance of logic and reason. But it's only an appearance.

You can't duct tape enough speak and spells together to rival the mass of the Sun and have it somehow just become something that outputs a believable human voice.


For an incredibly long time, ChatGPT would fail questions along the lines of "What's heavier, a pound of feathers or three pounds of steel?" because it had seen the normal variation of the riddle with equal weights so many times. It has no concept of one being smaller than three. It just "knows" the pattern of the "correct" response.

It no longer fails that "trick", but there's significant evidence that OpenAI has set up custom handling for that riddle on top of the actual LLM, as it doesn't take much work to find similar ways to trip it up by using slightly modified versions of classic riddles.

A lot of supporters will counter, "Well I just ask it to tell the truth, or tell it that it's wrong, and it corrects itself", but I've seen plenty of anecdotes in the opposite direction, with ChatGPT insisting that its hallucination was fact. It doesn't have any concept of true or false.
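As a rough illustration of "patterns without truth", here is a toy bigram generator in Python. It is far simpler than a real LLM and is only a sketch of the general idea: the function names `train_bigrams` and `generate` and the sample sentence are made up for this example, and nothing here reflects how ChatGPT is actually built.

```python
import random
from collections import defaultdict

def train_bigrams(text):
    """Record, for every word, the words that followed it in the text."""
    words = text.split()
    table = defaultdict(list)
    for cur, nxt in zip(words, words[1:]):
        table[cur].append(nxt)
    return table

def generate(table, start, length, seed=0):
    """Chain words by picking a random observed follower each step."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length - 1):
        followers = table.get(out[-1])
        if not followers:
            break  # no observed continuation for this word
        out.append(rng.choice(followers))
    return " ".join(out)

# Hypothetical training text; the model only ever sees word order.
corpus = "a pound of feathers weighs the same as a pound of steel"
table = train_bigrams(corpus)
print(generate(table, "a", 8))
```

The output is always locally plausible because every adjacent word pair was seen in the training text, yet the program never compares any weights.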

[-] neatchee@lemmy.world 22 points 7 months ago* (last edited 7 months ago)

The shame of it is that despite this limitation LLMs have very real practical uses that, much like cryptocurrencies and NFTs did to blockchain, are being undercut by hucksters.

Tesla has done the same thing with autonomous driving too. They claimed to be something they're not (fanboys don't @ me about semantics) and made the REAL thing less trusted and take even longer to come to market.

Drives me crazy.

[-] FlashMobOfOne@lemmy.world 8 points 7 months ago

Yup, and I hate that.

I really would like to one day just take road trips everywhere without having to actually drive.

[-] neatchee@lemmy.world 4 points 7 months ago* (last edited 7 months ago)

Right? Waymo is already several times safer than humans and Tesla's garbage, yet municipalities keep refusing them. Trust is a huge problem for them.

And yes, haters, I know that they still have problems in inclement weather, but that's kinda the point: we would be much further along if it weren't for the unreasonable hurdles they keep facing because of fear created by Tesla.

[-] FlashMobOfOne@lemmy.world 2 points 7 months ago

Hadn't heard of this. Thanks!

[-] humorlessrepost@lemmy.world 2 points 7 months ago* (last edited 7 months ago)

For road trips (i.e. interstates and divided highways), GM’s Super Cruise is pretty much there unless you go through a construction zone. I just went from Atlanta to Knoxville without touching the steering wheel once.

[-] FlashMobOfOne@lemmy.world 1 points 7 months ago

I'll look into that when my Kia passes away. Thank you!

[-] yessikg@lemmy.blahaj.zone 0 points 7 months ago

Trains are really good for that

[-] FlashMobOfOne@lemmy.world 1 points 7 months ago

You can't road trip in a train.

[-] cygon@lemmy.world 9 points 7 months ago

I love that example. Microsoft's Copilot (based on GPT-4) doesn't disappoint:

Microsoft Copilot: Two pounds of feathers and a pound of lead both weigh the same: two pounds. The difference lies in the material—feathers are much lighter and less dense than lead. However, when it comes to weight, they balance out equally.

It's annoying that for many things, like basic programming tasks, it manages to generate reasonable output that is good enough to goad people into trusting it, yet hallucinates very obviously wrong stuff or follows completely insane approaches on anything off the beaten path. Every other day, I have to spend an hour justifying to a coworker why I wrote code this way when the AI has given him another "great" suggestion, like opening a hidden window with a UI control to query a database instead of going through our ORM.

[-] rottingleaf@lemmy.zip 6 points 7 months ago

but it is also literally and technologically impossible for that to spontaneously coalesce into reason/logic/sentience

Yeah, see, one very popular modern religion (without official status or need for one to explicitly identify with it, but really influential) is exactly about "a wonderful invention" spontaneously emerging in the hands of some "genius" who "thinks differently".

Most people put this idea far above reaching your goal by taking a myriad of small steps, not skipping a single one.

They also want a magic wand.

The fans of "AI" today are, deep down, simply Luddites. They want some new magic to emerge to destroy the magic they fear.

[-] Akisamb@programming.dev 1 points 7 months ago

It's absolutely amazing, but it is also literally and technologically impossible for that to spontaneously coalesce into reason/logic/sentience.

This is not true. If you train these models on games of Othello, they'll keep a state of the world internally and use that to predict the next move played (1). To execute addition and multiplication, they are executing an algorithm on which they were not explicitly trained (although the GPT family is surprisingly bad at it, due to a badly designed tokenizer).

These models are still pretty bad at most reasoning tasks. But training on predicting the next word is a perfectly valid strategy. After all, the best way to predict what comes after the "=" in 1432 + 212 = is to do the addition.

[-] PopShark@lemmy.world 1 points 7 months ago

Yep, the hallucination issue happens even in GPT-4. In my experience, certain topics bring about hallucinations more than others, but if ChatGPT (even with GPT-4 or whatever other advanced version) gets “stuck” on believing its hallucinations, the only way to convince it is to plainly state the part that’s wrong and direct it to search Bing, or the internet some other way, specifically for that. Otherwise you just let out a sigh and start a new chat. If you spend too much time negotiating with it, that wastes tokens anyway, so the chat becomes bloated and it forgets stuff from earlier in the chat, not to mention that you’re technically paying to use the more advanced model. Basically, the more you treat the chat like a normal conversation, the worse it goes with AI. I guess that’s why “prompt engineering” was or is a thing, whether legitimate or not.

I also noted, importantly, that if you pay for credits with OpenAI to use their “playground” to create a specifically customized GPT-4, adjusting temperature and response types, it takes some getting used to because it is WAY different from ChatGPT, regardless of which version of GPT you have it set to. It actually kind of blew me away with how much better it “””understood””” software development, but the issue is that you kind of have to set up chats yourself, it’s more complex, and you pay per token, so mistakes cost you. If it wasn’t such a pain and I had a specific use case, I would definitely rather pay for OpenAI credits as needed than their bs “Plus” $20/month subscription for nerfed GPT-4 as a chatbot.
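For anyone wondering what the temperature setting does, here is a minimal Python sketch of the usual textbook idea: the model's raw scores are divided by the temperature before being turned into probabilities. The function name `softmax_with_temperature` and the example scores are made up for illustration, and this is an assumption about the general technique, not a description of OpenAI's actual implementation.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities, scaled by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate next tokens.
logits = [2.0, 1.0, 0.1]
cold = softmax_with_temperature(logits, 0.2)  # low temperature
hot = softmax_with_temperature(logits, 2.0)   # high temperature
print(cold)
print(hot)
```

With the low temperature, almost all of the probability lands on the top-scoring token, so output is more repeatable. With the high temperature, the probabilities flatten out and less likely tokens get picked more often.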

this post was submitted on 05 Apr 2024
869 points (96.2% liked)
