Microsoft Sued For AI Article Accusing Innocent Man of Sexual Misconduct (futurism.com)

submitted 1 year ago by catculation@lemmy.zip to c/technology@lemmy.world

51 comments

[-] dual_sport_dork@lemmy.world 226 points 1 year ago* (last edited 1 year ago)

Say it with me again now:

For fact-based applications, the amount of work required to develop and subsequently babysit the LLM to ensure it is always producing accurate output is exactly the same as doing the work yourself in the first place.

Always, always, always. This is a mathematical law. It doesn't matter how much you whine or argue, or cite anecdotes about how you totally got ChatGPT or Copilot to generate you some working code that one time. The LLM does not actually have comprehension of its input or output. It doesn't have comprehension, period. It cannot know when it is wrong. It can't actually know anything.

Sure, very sophisticated LLMs might get it right some of the time, or even a lot of the time in the case of very specific topics with very good training data. But their accuracy cannot be guaranteed unless you fact-check 100% of their output.

> Underpaid employees were asked to feed published articles from other news services into generative AI tools and spit out paraphrased versions. The team was soon using AI to churn out thousands of articles a day, most of which were never fact-checked by a person. Eventually, per the NYT, the website's AI tools randomly started assigning employees' names to AI-generated articles they never touched.

Yep, that right there. I could have called that before they even started. The shit really hits the fan when the computer is inevitably capable of spouting bullshit far faster than humans are able to review and debunk its output, and that's only if anyone is actually watching and has their hand on the off switch. Of course, the end goal of these schemes is to be able to fire as much of the human staff as possible, so it ultimately winds up that there is nobody left to actually do the review. And whatever emaciated remains of management are left don't actually understand how the machine works nor how its output is generated.

Yeah, I see no flaws in this plan... Carry the fuck on, idiots.

[-] RiikkaTheIcePrincess@pawb.social 58 points 1 year ago

Did you enjoy humans spouting bullshit faster than humans can debunk it? Well, brace for impact because here comes machine-generated bullshit! Wooooeee'refucked! 🥳

[-] dual_sport_dork@lemmy.world 29 points 1 year ago

To err is human. But to really fuck up, you need a computer.

[-] gravitas_deficiency@sh.itjust.works 5 points 1 year ago* (last edited 1 year ago)

A human can only do bad or dumb things so quickly.

A human writing code can do bad or dumb things at scale, as well as orders of magnitude more quickly.

[-] dual_sport_dork@lemmy.world 2 points 1 year ago

And untangling that clusterfuck can be damn near impossible.

The reaper may not present his bill immediately, but he will always present his bill eventually. This is a zero-sum thing: There is no net savings because the work required can be front loaded or back loaded, and you sitting there at the terminal in the present might not know. Yet.

There are three phases where time and effort are input, and wherein asses can be bitten either preemptively or after the fact:

1. Loading the algorithm with all the data. Where did all that data come from? In the case of LLMs, it came from an infinite number of monkeys typing on an infinite number of keyboards. That is, us. The system is front loaded with all of this time and effort -- stolen, in most cases. Also the time and effort spent by those developing the system and loading it with said data.
2. At execution time. This is the classic example, i.e. the algorithm spits out into your face something that is patently absurd. We all point and laugh, and a screen shot gets posted to Lemmy. "Look, Google says you should put glue on your pizza!" Etc.
3. Lurking horrors. You find out about the problem later. Much later. After the piece went to print, or the code went into production. "Time and effort were saved," producing the article or writing the code. Yes, they appeared to be -- then. Now it's now. Significant expenditure must be made cleaning up the mess. Nobody actually understood the code but now it has to be debugged. And somebody has to pay the lawyers.

[-] Blue_Morpho@lemmy.world 14 points 1 year ago

Your statement is technically true but wrong in practice, because it applies to EVERYTHING on the Internet. We had tons of error-ridden garbage articles written by underpaid interns long before AI.

And no, fact checking is quicker than writing something from scratch. Just like verifying Wikipedia sources is quicker than writing a Wikipedia article.

[-] rottingleaf@lemmy.zip 5 points 1 year ago

> And no, fact checking is quicker than writing something from scratch. Just like verifying Wikipedia sources is quicker than writing a Wikipedia article.

For something created by a human - yes. For something created by a text generator - hell no.

[-] admin@lemmy.my-box.dev 3 points 1 year ago

Can you elaborate on that?

[-] iarigby@lemmy.world 3 points 1 year ago

For example, in code, machine errors are sometimes much harder to detect or diagnose because they are nothing like what a human would do. I would expect the same in text: everything looks correct, because that’s what it is designed to do. Except that in code you have a much higher chance of quickly knowing that there is an error somewhere, while with text you don’t even get a warning that you need to start looking for errors.

[-] raspberriesareyummy@lemmy.world 11 points 1 year ago

A-MEN. well put. I wouldn't make so many words, I'd just settle for "Fuck LLMs and fuck the dipshits who label it AI or think it has anything to do with AI."

[-] yokonzo@lemmy.world 10 points 1 year ago* (last edited 1 year ago)

Okay, yes I agree with you fully, but you can't just say it's a mathematical law without proof, that's something you need to back up with numbers and I don't think "work" is quantifiable.

Again, yes, they need to slow down, but I have an issue with your claim unless you're going to be backing it up. Otherwise you're just a crazy dude standing on a soapbox

[-] henfredemars@infosec.pub 10 points 1 year ago* (last edited 1 year ago)

The cost however is not the same. I can totally see the occasional lawsuit as the cost of doing business for a company that employs AI.

[-] dual_sport_dork@lemmy.world 23 points 1 year ago

This is almost certainly what we're looking at here. It's the Ford Pinto for the modern age. "So what if a few people get blown up/defamed? Paying for that will cost less than what we made, so we're still in the black." Yeah, that's grand.

Further, generative "AI's" and language models like these are fine when used for noncritical purposes where the veracity of the output is not a requirement. Dall-E is an excellent example, where all it's doing is making varying levels of abstract art and provided nobody is stupid enough to take what it spits out for an actual photograph documenting evidence of something, it doesn't matter. Or, "Write me a poem about crows." Who cares if it might file crows in the wrong taxonomy as long as the poem sounds nice.

Facts and LLM's don't mix, though.

[-] anton@lemmy.blahaj.zone 6 points 1 year ago* (last edited 1 year ago)

While that works for "news agencies" it's a free money glitch when used in a customer support role for the consumer.

Edit: clarification

[-] henfredemars@infosec.pub 19 points 1 year ago

Pretty sure an airline was forced to pay out on a fake policy that one of their support bots spouted.

[-] cheese_greater@lemmy.world 7 points 1 year ago* (last edited 1 year ago)

I can see how it might be seen as more facile to correct/critique than to produce the original work. This is actually true, same as how it's easier to iterate on something than to wholesale create the thing.

Definitely find it easier to extend or elaborate on something "old" over crapping out a new thing, although I can see how that is not always the case if it's too "legacy". ChatGPT is intriguing because it can arguably modularly generate many of the parts; you would just need to glue them together properly and ensure all the outputs are cohesive and coherent.

For example: if you're a lawyer and you generate anything, you must at the very least

Read, not dictate
Ensure all caselaw cited a) definitely exists and b) is relevant to the facts and arguments they are being used to support

[-] NeptuneOrbit@lemmy.world 4 points 1 year ago

I think it's worse than that. The work is about the same. The skill and pay for that work? Lower.

Why pay 10 experienced journalists when you can pay 10 expendable fact checkers who just need to run some facts/numbers by a Wikipedia page?

[-] artichokecustard@lemmy.world 3 points 1 year ago

The other big thing is that once it does start spouting bullshit, or even just fixates on a phrase or string of words, it's so hard to get it out. You really just have to start your instance over or purge the memory. They get obsessed so easily sometimes, without entirely sacrificing relevancy to the topic.

[-] GBU_28@lemm.ee 3 points 1 year ago

LLMs are useful for recalling from a fixed corpus where you dictate that they cite their sources.

They are ideal for human-in-the-loop research solutions.

The whole "answer anything about anything" concept is dumb.
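
As a rough illustration of the "fixed corpus, cite your source, human in the loop" idea above, here is a minimal Python sketch. Everything in it is an assumption made up for the example: the tiny `CORPUS`, the `[doc-id]` citation format, and the `check_citations` helper are hypothetical and do not come from the thread or from any real library. It only sorts sentences into buckets so a person knows where to look first.

```python
import re

# Hypothetical fixed corpus (document ID -> text). In practice this would be
# whatever vetted collection the model is allowed to draw from.
CORPUS = {
    "policy-1": "Refunds are available within 30 days of purchase.",
    "policy-2": "Gift cards cannot be exchanged for cash.",
}

# Assumed citation format: a claim carries a tag such as [policy-1].
CITATION = re.compile(r"\[([\w-]+)\]")


def check_citations(answer: str, corpus: dict[str, str]) -> dict[str, list[str]]:
    """Sort the sentences of a generated answer into buckets for a reviewer.

    This only checks that each cited document exists in the corpus. It does
    NOT check that the document supports the claim; a human still has to.
    """
    report: dict[str, list[str]] = {
        "cited": [],
        "unknown_source": [],
        "uncited": [],
    }
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", answer.strip()):
        if not sentence:
            continue
        doc_ids = CITATION.findall(sentence)
        if not doc_ids:
            report["uncited"].append(sentence)
        elif all(doc_id in corpus for doc_id in doc_ids):
            report["cited"].append(sentence)
        else:
            report["unknown_source"].append(sentence)
    return report


if __name__ == "__main__":
    generated = (
        "Refunds are available within 30 days [policy-1]. "
        "Pets travel free [policy-9]. "
        "We always upgrade you."
    )
    for bucket, sentences in check_citations(generated, CORPUS).items():
        print(bucket, sentences)
```

Even in this toy version, the "cited" bucket is not verified, only traceable: the reviewer still has to open the cited document and confirm it says what the sentence claims.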

[-] rottingleaf@lemmy.zip 3 points 1 year ago

> Sure, very sophisticated LLM’s might get it right some of the time, or even a lot of the time in the cases of very specific topics with very good training data. But its accuracy cannot be guaranteed unless you fact-check 100% of its output.

You will only guarantee what you answer for.

Since they have power to make it so, they own the good part and disown the bad part.

It's the warfare logic, the collateral damage of FAB-1500 is high, but it makes even imps in the hell tremble when dropped.

And to be treated more gently you need a different power balance. Either make them answer to you, or cut them out. You can't cut out a bombardment, though, and with the TRON project in Japan MS specifically have already shown that they are willing and able to use lobbying to force themselves onto you.

> Of course, the end goal of these schemes is to be able to fire as much of the human staff as possible, so it ultimately winds up that there is nobody left to actually do the review. And whatever emaciated remains of management are left don’t actually understand how the machine works nor how its output is generated.

Reminiscent of the Soviet "they imitate pay, we imitate work" thing. Or medieval kings debasing their coins by reducing the metal percentages. The modern Web is not very transparent, and the income is ad-driven, so it's not immediately visible that generated bullshit isn't worth nearly as much as something written by a human.

What I'm trying to say is that the way it's interconnected and amortized this Web is going down as a whole, and not just people poisoning it for short-term income.

This is intentional, they don't want to go down alone, and when more insularity exists, such people go down and others don't. Thus they've almost managed to kill that insularity. This will still work the same good old evolutionary way, just slower.

[-] dependencyinjection@discuss.tchncs.de -1 points 1 year ago

Simply false in my experience.

We use Copilot at work and there is no babysitting required.

We are software developers / engineers and it saves countless hours writing boilerplate code, giving code blocks based on a comment, and sticking to our coding conventions.

Sure it isn’t 100% right, but the owner and lead engineer calculates it to be around 70% accurate and even if it misses the mark, we have a whole lot fewer key presses to make.

[-] HauntedCupcake@lemmy.world 17 points 1 year ago

Using Copilot as a copilot, like generating boilerplate and then code reviewing it is still "babysitting" it. It's still significantly less effort than just doing it yourself though

[-] FarceOfWill@infosec.pub 10 points 1 year ago

Until someone uses it for a little more than boilerplate, and the reviewer nods that bit through as it's hard to review and not something a human/the person who "wrote" it would get wrong.

Unless all the ai generated code is explicitly marked as ai generated this approach will go wrong eventually.

[-] admin@lemmy.my-box.dev 6 points 1 year ago

> Unless all the ai generated code is explicitly marked as ai generated this approach will go wrong eventually.

Undoubtedly. Hell, even when you do mark it as such, this will happen. Because bugs created by humans also get deployed.

Basically what you're saying is that code review is not a guarantee against shipping bugs.

[-] HauntedCupcake@lemmy.world 1 points 1 year ago* (last edited 1 year ago)

Agreed, using LLMs for code requires you to be an experienced dev who can understand what it pukes out. And for those very specific and disciplined people it's a net positive.

However, generally, I agree it's more risk than it's worth

[-] dave@feddit.uk 1 points 1 year ago

Surely boilerplate code is copy / paste or macros, then edit the significant bits—a lot less costly than copilot.

[-] dependencyinjection@discuss.tchncs.de 5 points 1 year ago

That would still take more effort.

So, for an example we use a hook called useChanges() for tracking changes to a model in the client, it has a very standard set of arguments.

Why would we want to waste time writing it out all the time when we can write the usual comment “Product Model” and have it do the work.

Copy and Paste takes more effort as we WILL have to change the dynamic parts every time, macros will take longer as we have to create the macros for every different convention we have.

If you can’t see the benefit of LLMs as a TOOL to aid developers then I would hazard a guess you are not in the industry or you just haven’t even given them a go.

I will say I am a new developer and not amazing, but my boss, the owner and lead engineer, is a certified genius who will write flawless code on damn Teams to help me along at times, and if he can benefit from the time saved then anybody would.

[-] dave@feddit.uk 2 points 1 year ago

My PhD was in neural networks in the 1990s and I’ve been in development since then.

Remember when digital cameras came out? They were pretty crappy compared to film—if you had a decent film camera and knew what you were doing. I feel like that’s where we’re at with LLMs right now.

Digital cameras are now pretty much on par with film, perhaps better in some circumstances and worse in others.

Shifting gear from writing code to reviewing someone else’s is inefficient. With a good editor setup and plenty of screen real estate, I’m more productive just writing than constantly worrying about what the copilot just inserted. And yes, I’ve tested that.

[-] dependencyinjection@discuss.tchncs.de 1 points 1 year ago

Clearly what works for our company ain’t what would work for you, even if I think it’s preposterous what you’re claiming.

My boss was working on Open Source from the BSD days and is capable of very low-level programming. He has forgotten more than I’ll ever know, and if he can find LLMs a useful tool for our literal company to improve productivity then I’m inclined to stick with what I have seen and experienced. Just not having to go and search documentation alone is a massive time saver. Unless obviously you know everything, which nobody does.

[-] echodot@feddit.uk 1 points 1 year ago* (last edited 1 year ago)

How is it more effort to automate boilerplate code? Seriously the worst part of being a programmer is writing the same line of code all of the time. Especially when you know that it won't actually cause anything interesting to happen on the screen it's just background stuff that needs to happen.

When I used to develop websites I don't think I could have lived without Emmet, which was basically the predecessor to Copilot.

[-] HauntedCupcake@lemmy.world 0 points 1 year ago

Well, you have to actually set up the boilerplate, plus Copilot is generally more intelligent and context-aware, especially for small snippets when you're already coding.

[-] Olap@lemmy.world 5 points 1 year ago

What if I told you that typing in software engineering encompasses less than 5% of your day?

[-] dependencyinjection@discuss.tchncs.de 3 points 1 year ago

I’m a developer and typing encompasses most of my day. The owner and lead engineer has many meetings and admin work, but is still writing code and scaffolding new projects around 30% of his time.

[-] dual_sport_dork@lemmy.world 3 points 1 year ago

I'm a developer and typing encompasses most of my day as well, but increasingly less of it is actually producing code. Ever more of it is in the form of emails, typically in the process of being forced to argue with idiots about what is and isn't feasible/in the spec/physically possible, or explaining the same things repeatedly to the types of people who should not be entrusted with a mouse.

[-] null@slrpnk.net -2 points 1 year ago* (last edited 1 year ago)

> Always, always, always. This is a mathematical law.

Total bullshit. We use LLMs at work for tasks that would be nearly impossible and require obscene amounts of manpower to do by hand.

Yes, we have to check the output, but it's not even close to the amount of work it would take to do it by hand. Like, by orders of magnitude.

[-] balder1991@lemmy.world 4 points 1 year ago* (last edited 1 year ago)

Yeah. I’m not sure that statement applies. It’s easier for humans to check something than to come up with something in the first place. But the thing is, the person doing the checking also needs to be proficient in the subject.

[-] ocassionallyaduck@lemmy.world 82 points 1 year ago

I hope he wins, and the fine makes Microsoft's eyes water. Everyone needs to slow the fuck down with this, and they won't until there are real painful consequences.

MS can drop billions on game company acquisitions like it's no big deal? Cool, give this guy 1 billion dollars for randomly singling him out and automated-accusing him of sex crimes.

Maybe then all the tech bros might pause for 3 seconds before they keep feeding shit into their models illegally.

[-] Boozilla@lemmy.world 45 points 1 year ago

This US election was going to be a no-good-choices shitshow no matter what. But I really dread the AI-amped shitshow we're gonna get.

[-] datavoid@lemmy.ml 13 points 1 year ago

Personally I think it will be quite entertaining.

That being said, I'm not american

[-] Boozilla@lemmy.world 15 points 1 year ago

It's funny from the outside for sure. Until some crazy old creep tries to put in a launch code. Hopefully, they just give them a fake button to mash. Seems like the smart thing to do.

[-] neo@lemy.lol 4 points 1 year ago

While it is indeed entertaining to have a great view on the iceberg crashing the ship, I'm quite worried about the consequences.

Especially since many other decks (countries) already have severe problems.

[-] datavoid@lemmy.ml 1 points 1 year ago

That's fair for sure. In my mind if we made it through the first 4 years, hopefully we will survive the next 4 too.

[-] dkc@lemmy.world 32 points 1 year ago

Good

[-] autotldr@lemmings.world 9 points 1 year ago

This is the best summary I could come up with:

Worse yet, the erroneous reporting was scooped up by MSN — the somehow not-dead-yet Microsoft site that aggregates news — and was featured on its homepage for several hours before being taken down.

It's an unfortunate example of the tangible harms that arise when AI tools implicate real people in bad information as they confidently — and convincingly — weave together fact and fiction.

And if Bigfoot conspiracies slip through MSN's very large and automated cracks, it's not surprising that a real-enough-looking AI-generated article like "Prominent Irish broadcaster faces trial over alleged sexual misconduct" made it onto the site's homepage.

According to the NYT, the website was founded by an alleged abuser and tech entrepreneur named Gurbaksh Chahal, who billed BNN as "a revolution in the journalism industry."

Underpaid employees were asked to feed published articles from other news services into generative AI tools and spit out paraphrased versions.

Eventually, per the NYT, the website's AI tools randomly started assigning employees' names to AI-generated articles they never touched.

The original article contains 559 words, the summary contains 167 words. Saved 70%. I'm a bot and I'm open source!

[-] VeryVito@lemmy.ml 29 points 1 year ago

And now I’m reading a computer’s version of a story describing how a computer wrote a story that should have been discarded.

[-] Bluetreefrog@lemmy.world 17 points 1 year ago

It's even better than that. It's a computer's version of a story describing how a computer wrote a story which was then front-paged by a computer.

[-] leftzero@lemmynsfw.com 9 points 1 year ago

Read the room, bot. 🤦‍♂️

this post was submitted on 07 Jun 2024

557 points (99.3% liked)
