I'm more concerned about LLMs collapsing the whole idea of a "real world".
I'm not a machine learning expert, but I do get the basic concept of training a model and then evaluating its output against real data. The whole thing rests on the idea that you train the model on a relatively small sample of the real world, and keep a big, clearly distinct chunk of the "real world" aside to check the model's performance against.
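Roughly the picture I have in mind, as a toy sketch (this assumes scikit-learn; the dataset, split sizes and model are purely illustrative, not anyone's actual setup):

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Toy stand-in for the "real world": a small labelled dataset.
X, y = load_digits(return_X_y=True)

# Train on a relatively small sample, keep a big, clearly distinct
# chunk of the "real world" aside for evaluation.
X_train, X_held_out, y_train, y_held_out = train_test_split(
    X, y, train_size=0.2, random_state=0
)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# The score only means something because the held-out data is
# independent of what the model was trained on.
print(accuracy_score(y_held_out, model.predict(X_held_out)))
```

The whole point is the split: if there's no clean chunk of reality left that the model hasn't already eaten, the evaluation step stops meaning much.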
If LLMs have already ingested basically all the information in the "real world", and their output is so pervasive that you can't easily tell what's true from what's AI-generated slop, then "how do we train our models now?" is not my main concern.
As an example, take the judges who found made-up cases because lawyers had used an LLM. What happens when those made-up cases get referenced in several other places, including legal textbooks used in law schools? Don't they become part of the "real world"?
LLMs are not going to be the future. The tech companies know it and are working on reasoning models that can look things up to fact-check themselves. These are slower, use more power, and are still a work in progress.
Look things up where? Some things are verifiable more or less directly: the Moon is not 80% made of cheese, adding glue to pizza is not healthy, the average human hand does not have seven fingers. A "reasoning" model might do better with those than current LLMs do.
But for a lot of our knowledge, verifying means "I say X because here are two reputable sources that say X". For that kind of verification, AI-generated text creeping up everywhere (including in peer-reviewed scientific papers, which tend to be considered reputable) is blurring the line between truth and "hallucination" for both LLMs and humans.
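To make that concrete, here's a deliberately naive Python sketch (the case name, the sources and the threshold are all invented): if "verified" just means "at least two sources agree", nothing in the check distinguishes a reputable citation from AI-generated text repeating the same claim.

```python
# Invented mini-corpus: two "reputable" sources and one AI-generated one
# that happens to repeat the same claim.
SOURCES = {
    "legal_textbook_a": "Smith v. Jones (1998) established the duty of care standard.",
    "law_review_b": "The duty of care standard comes from Smith v. Jones (1998).",
    "ai_generated_blog": "Smith v. Jones (1998) established the duty of care standard.",
}

def corroborated(claim_keywords: set[str], sources: dict[str, str], minimum: int = 2) -> bool:
    """Treat a claim as verified if enough sources mention all its keywords."""
    hits = [
        name for name, text in sources.items()
        if all(kw.lower() in text.lower() for kw in claim_keywords)
    ]
    # Weak point: nothing here says whether a hit is an independent,
    # reputable source or just more generated text echoing the claim.
    return len(hits) >= minimum

print(corroborated({"smith v. jones", "duty of care"}, SOURCES))  # True... but is it?
```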
Who said that adding glue to pizza is not healthy? Meat glue is used in restaurants all the time!