Wikipedia is under assault: rogue users keep posting AI generated nonsense (www.techspot.com)

submitted 2 years ago* (last edited 2 years ago) by ForgottenFlux@lemmy.world to c/technology@lemmy.world

Wikipedia has a new initiative called WikiProject AI Cleanup. It is a task force of volunteers currently combing through Wikipedia articles, editing or removing false information that appears to have been posted by people using generative AI.

Ilyas Lebleu, a founding member of the cleanup crew, told 404 Media that the crisis began when Wikipedia editors and users began seeing passages that were unmistakably written by a chatbot of some kind.

[-] schizo@forum.uncomfortable.business 238 points 2 years ago

Further proof that humanity neither deserves nor is capable of having nice things.

Who would set up an AI bot to shit all over the one remaining useful thing on the Internet, and why?

I'm sure the answer is either 'for the lulz' or 'late-stage capitalism', but still: historically humans aren't usually burning down libraries on purpose.

[-] poszod@lemmy.world 116 points 2 years ago

State actors could be interested in doing that. Same with the internet archive attacks.

[-] Schmoo@slrpnk.net 98 points 2 years ago

historically humans aren't usually burning down libraries on purpose.

How on earth have you come to this conclusion?

[-] sugar_in_your_tea@sh.itjust.works 36 points 2 years ago

To be fair, it's usually to effect cultural genocide. It's not average people burning libraries, it's usually some kind of authoritarian regime.

[-] SacralPlexus@lemmy.world 34 points 2 years ago* (last edited 2 years ago)

*looks around and gestures broadly in agreement*

[-] OpenStars@discuss.online 18 points 2 years ago

Florida says hello. A bunch of other places too, sadly :-(

[-] Regrettable_incident@lemmy.world 13 points 2 years ago

historically humans aren't usually burning down libraries on purpose.

Sometimes they are. Baghdad springs to mind, and I'm sure there are other examples. And this library is online, so there's less chance of getting caught with a can of petrol and a box of matches.

Then there's every authoritarian regime that tries to ban or burn specific types of books. What we're seeing here could be more like that - an attempt to muddy the waters or introduce misinformation on certain topics.

[-] Wrench@lemmy.world 9 points 2 years ago

Because basement losers can't conquer and raze libraries to the ground.

The internet has shown that assumed anonymity results in people fucking with other people's lives for the hell of it. Viruses, trolling, etc. This is just the next stage of it because of a new, easy-to-use tool.

[-] sbv@sh.itjust.works 115 points 2 years ago

As for why this is happening, the cleanup crew thinks there are three primary reasons.

"[The] main reasons that motivate editors to add AI-generated content: self-promotion, deliberate hoaxing, and being misinformed into thinking that the generated content is accurate and constructive,"

That last one. Ouch.

[-] givesomefucks@lemmy.world 48 points 2 years ago

The vast majority of people think they're the good guys...

[-] TimLovesTech@badatbeing.social 34 points 2 years ago

“[The] main reasons that motivate editors to add AI-generated content: self-promotion, deliberate hoaxing, and being misinformed into thinking that the generated content is accurate and constructive,”

I think the main reason people are misinformed about AI content is that, outside of tech people, most have no idea that AI will:

100% make up answers to things it doesn't know, because the data it ingested was either too small a sample or just bad. And it will do this with the same robot confidence you get for any other answer.
AI that has been fed too much other AI-generated content will begin to "hallucinate" and give some wild outputs, very similar to humans suffering from schizophrenia. And again, these answers will be given as "fact" with the same robotic confidence.

[-] kibiz0r@midwest.social 69 points 2 years ago

Unleashing generative AI on the world was basically the information equivalent of jumping headfirst into Kessler Syndrome.

[-] khannie@lemmy.world 44 points 2 years ago

For the uninitiated like me:

The Kessler syndrome (also called the Kessler effect,[1][2] collisional cascading, or ablation cascade), proposed by NASA scientists Donald J. Kessler and Burton G. Cour-Palais in 1978, is a scenario in which the density of objects in low Earth orbit (LEO) due to space pollution is numerous enough that collisions between objects could cause a cascade in which each collision generates space debris that increases the likelihood of further collisions.

Wikipedia link.

[-] kibiz0r@midwest.social 19 points 2 years ago

Good call, thank you.

Also: Referencing Wikipedia in this context is kinda funny.

[-] khannie@lemmy.world 11 points 2 years ago

I did think that. :) It's just... so good. I hope it never enshittifies. God help us.

[-] narc0tic_bird@lemm.ee 52 points 2 years ago

Best case is that the model used to generate this content was originally trained on data from Wikipedia, so it "just" generates a worse, hallucinated "variant" of the original information. Goes to show how stupid this idea is.

Imagine this in a loop: AI trained on Wikipedia that then alters content on Wikipedia, which in turn gets picked up by the next model trained. It would just get worse and worse, similar to how converting the same video over and over again yields continuously worse results.
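
The loop described above (sometimes called model collapse) can be sketched as a toy simulation. This is only an illustration under loud assumptions, not how any real model is trained: the "model" here is just the empirical token frequencies of the previous generation's text, and the function name `simulate_collapse`, the vocabulary size, and the sample size are all made up for the example. It shows one narrow effect: once a rare token fails to show up in a generation, no later generation can ever produce it again.

```python
import random
from collections import Counter


def simulate_collapse(vocab_size=50, sample_size=200, generations=30, seed=0):
    """Toy feedback loop: each generation is 'trained' only on the previous
    generation's output. Returns the number of distinct tokens per generation.
    All parameters are hypothetical values chosen for illustration."""
    rng = random.Random(seed)

    # Generation 0: "human" text drawn from a long-tailed (Zipf-like) distribution.
    tokens = list(range(vocab_size))
    weights = [1.0 / (rank + 1) for rank in range(vocab_size)]
    data = rng.choices(tokens, weights=weights, k=sample_size)

    distinct_per_generation = [len(set(data))]
    for _ in range(generations):
        # "Train": estimate token frequencies from the current data only.
        counts = Counter(data)
        seen = list(counts.keys())
        freqs = [counts[token] for token in seen]
        # "Generate": the next corpus is sampled from that estimate,
        # so tokens that dropped out can never come back.
        data = rng.choices(seen, weights=freqs, k=sample_size)
        distinct_per_generation.append(len(set(data)))
    return distinct_per_generation


if __name__ == "__main__":
    history = simulate_collapse()
    print("distinct tokens per generation:", history)
```

The printed list can only stay flat or shrink from one generation to the next, which is the "worse and worse" part of the comment in miniature. Real systems are far more complicated, so treat this as an analogy rather than a measurement.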

[-] huginn@feddit.it 24 points 2 years ago

See also: model collapse

(Which is more or less just regression towards the mean with more steps)

[-] Wrench@lemmy.world 15 points 2 years ago

Yes, this is what many of us worry will become the internet in general. AI content generated by AI trained on AI garbage.

AI bots can trivially outpace humans.

[-] kboy101222@sh.itjust.works 11 points 2 years ago

I was just discussing with a friend of mine how we're rapidly approaching the dead internet. At some point, many websites will likely just be chat bots talking to other chat bots, which then gets used to train further chat bots. Human-made content is already becoming harder and harder to find on algorithm-heavy websites like Reddit and Facebook's suite of sites. The bots can easily outpace any algorithmic changes the sites might make to help deter them: my FB-using family members all constantly block those weird Jesus accounts, and they still show up constantly.

[-] captain_aggravated@sh.itjust.works 7 points 2 years ago

Eventually every article just reads "Delve delve delve delve delve delve delve."

[-] TheGrandNagus@lemmy.world 43 points 2 years ago

Jesus Christ. The number of absolute bellends in the world never ceases to confound me.

[-] Bahnd@lemmy.world 34 points 2 years ago

They used to be contained; every village had its idiot. Now that the internet is the global village, all the formerly isolated idiots have a place to chat.

[-] 96VXb9ktTjFnRi@feddit.nl 42 points 2 years ago* (last edited 2 years ago)

Sabotage Wikipedia, DDoS the Internet Archive. Makes you wonder if in the future we're going to forget our past. Will actual history be obscured in a sea of alternative histories, presented so that they can't be told apart from the real thing? Maybe we need to keep some books lying around in archives just to be sure.

[-] TachyonTele@lemm.ee 16 points 2 years ago* (last edited 2 years ago)

The digital dark age will be a real thing, absolutely.

Interesting idea on a sea of alternative histories. That might be a possible threat.
Someone else here called it "AI text apocalypse". I like that term.

[-] randon31415@lemmy.world 41 points 2 years ago

If anyone can survive the AI text apocalypse, it is Wikipedia. They have been fending off and regulating article-writing bots since someone coded up a US town article writer from the 2000 census (not the 2010 or 2020 census, the 2000 census; this bot was writing Wikipedia articles in 2003).

[-] T156@lemmy.world 8 points 2 years ago

Hopefully they tightened things up after the Scots incident.

[-] lolola@lemmy.blahaj.zone 37 points 2 years ago

I hate to post because I have loved and trusted Wikipedia for years, but the fact that there are folks out there who equally trust what AI tools generate just baffles me.

[-] Dragonstaff@leminal.space 8 points 2 years ago

The signal to noise ratio is so low these days. There's so much information out there but everyone wants to profit from you before you can get it. Even worse, the people with good information generally can't buy as big a megaphone as the people who profit from lying to you.

Honestly, I think humans have been more likely to believe an easy lie than a hard truth all along, but it's easier than ever these days.

[-] nutsack@lemmy.world 31 points 2 years ago

why the fuck would anyone stick ai shit on wikipedia that doesn't make any sense

[-] NateNate60@lemmy.world 35 points 2 years ago

"[The] main reasons that motivate editors to add AI-generated content: self-promotion, deliberate hoaxing, and being misinformed into thinking that the generated content is accurate and constructive," Lebleu said.

[-] nutsack@lemmy.world 13 points 2 years ago

so, stupidity basically. they're just stupid.

[-] realitista@lemm.ee 10 points 2 years ago

Many people who are trying to push lies have an agenda to undermine Wikipedia. Trump, Putin supporters, etc.

[-] MellowSnow@lemmy.world 14 points 2 years ago

People suck

[-] InverseParallax@lemmy.world 7 points 2 years ago

The irony being that a huge amount of LLM knowledge was based on WP in the first place, along with scientific papers.

[-] xnx@slrpnk.net 30 points 2 years ago* (last edited 2 years ago)

Download the torrent for a local copy of Wikipedia from 2024 now

https://kiwix.org/

[-] varjen@lemmy.world 19 points 2 years ago

Or download it in a bunch of other ways directly from Wikipedia.

[-] Aatube@kbin.melroy.org 25 points 2 years ago

Don't worry, it's not as bad as the title suggests. The attack on Internet Archive is far, far worse. It's obviously a bit of a problem, though.

[-] Kolanaki@yiffit.net 19 points 2 years ago

fights back by posting human-generated nonsense

[-] WhatsHerBucket@lemmy.world 13 points 2 years ago

This is why we can’t have nice things

[-] FeelzGoodMan420@eviltoast.org 12 points 2 years ago

AI is the biggest pile of dogshit to come out of tech in the history of the human race

[-] Badeendje@lemmy.world 7 points 2 years ago

Require anyone who wants to add stuff to pay a small amount to the Wikimedia Foundation to activate their account, and refund it once they've moderated a certain amount.

[-] aubertlone@lemmy.world 7 points 2 years ago

Yeah, I mean, I've had minor edits reversed because I didn't source the fact properly.

And that was like 10 years ago. I'm surprised these edits are getting through in the first place.

this post was submitted on 11 Oct 2024

713 points (99.4% liked)
