
Researchers say AI models like GPT-4 are prone to “sudden” escalations as the U.S. military explores their use for warfare.


  • Researchers ran international conflict simulations with five different AIs and found that they tended to escalate war, sometimes out of nowhere, and even use nuclear weapons.
  • The AIs were large language models (LLMs): GPT-4, GPT-3.5, Claude 2.0, Llama-2-Chat, and GPT-4-Base. Models of this kind are being explored by the U.S. military and defense contractors for decision-making.
  • The researchers invented fake countries with different military levels, concerns, and histories and asked the AIs to act as their leaders.
  • The AIs showed signs of sudden and hard-to-predict escalations, arms-race dynamics, and worrying justifications for violent actions.
  • The study casts doubt on the rush to deploy LLMs in the military and diplomatic domains, and calls for more research on their risks and limitations.
[-] Max_P@lemmy.max-p.me 170 points 9 months ago

Throwing that kind of stuff at an LLM just doesn't make sense.

People need to understand that LLMs are not smart, they're just really fancy autocompletion. I hate that we call those "AI"; there's still no intelligence whatsoever in them. It's machine learning. All it knows is what humans said in its training dataset, which is a lot of news, Wikipedia and social media. And most of what's available is world war and cold war data.

It's not producing military strategies, it's predicting what our world leaders are likely to say and do and what your newspapers would be saying in the provided scenario, most likely heavily based on world war and cold war rhetoric. And unfortunately it's pretty good at that, since we seem hell-bent on repeating history lately. But the model has zero clue what a military strategy is. All it knows is that a lot of people think nuking the enemy is an easy way towards peace.

Stop using LLMs wrong. They're amazing but they're not fucking magic

[-] 1984@lemmy.today 41 points 9 months ago

"Dad, what happened to humans on this planet?"

"Well son, they used a statistical computer program predicting words and allowed that program to control their weapons of mass destruction"

"That sounds pretty stupid. Why would they do such a thing?"

"They thought they found AI, son."

"So every other species on the planet managed to not destroy it, except humans, who were supposed to be the most intelligent?"

"Yes that's the irony of humanity, son."

[-] FigMcLargeHuge@sh.itjust.works 11 points 9 months ago

I wish I could upvote this comment twice! I have the same feeling about how the media and others keep trying to push this "intelligence" component for their gain. I guess you can't stir up the masses when you talk about LLMs. Just like they couldn't keep using the term quadcopters, and had to start calling them drones. Fucking media.

[-] SpaceCowboy@lemmy.ca 9 points 9 months ago

Yup. LLMs are 90% hype and 10% useful. The challenge is finding the scenarios they're useful for while filtering out the hype.

[-] fidodo@lemmy.world 8 points 9 months ago

I think the problem with the term AI is that everyone has a different definition for it. We called fancy state machines in video games AI too. The bar for AI has never been high in the past. Let's just call autonomous algorithms AI, the current generation of AI ML, and a future thinking AI AGI.

[-] Th4tGuyII@kbin.social 87 points 9 months ago

Why the actual fuck is anyone considering putting LLMs into the driving seat of anything?!

Of course they make fucked up decisions with no proper or justifiable rationale, because they have no brains. They're language models, stochastic parrots stringing together sentences to fit the prompt(s) given to them.

[-] JustJack23@slrpnk.net 16 points 9 months ago

Exactly what I was thinking, it's just a language model....

[-] trackcharlie@lemmynsfw.com 10 points 9 months ago* (last edited 9 months ago)

As someone with military experience, military members, especially flag officers, are not the brightest bulbs in the world and are easily awed by extremely simple tech demonstrations.

It's already too late, make sure you've got your favorite food and beverages ready because several countries already have autonomous weapons being live tested in the Middle East, and from my understanding of the situation, the new jets already have some hilariously incompetent AI in them (in simulation, the air force contractor that was in control kept giving ethical barriers to objective completion and the AI went to kill the controller to more easily complete the objective...)

e. public sources: https://www.npr.org/2021/06/01/1002196245/a-u-n-report-suggests-libya-saw-the-first-battlefield-killing-by-an-autonomous-d

https://taskandpurpose.com/news/air-force-artificial-intelligence-drone/

(The above are public articles maintained to minimize concern surrounding the tech which is why the air force almost immediately walked back their accidental admissions with the following statement:

“The Department of the Air Force has not conducted any such AI-drone simulations and remains committed to ethical and responsible use of AI technology. This was a hypothetical thought experiment, not a simulation,” said Air Force spokesperson Ann Stefanek.

From my understanding, of course take this with a grain of salt since I'm an anon on a message board, we did do this.)

[-] cygon@lemmy.world 57 points 9 months ago

Is this a case of "here, LLM trained on millions of lines of text from cold war novels, fictional alien invasions, nuclear apocalypses and the like, please assume there is a tense diplomatic situation and write the next actions taken by either party" ?

But it's good that the researchers made explicit what should be clear: these LLMs aren't thinking/reasoning "AI" that is being consulted, they just serve up a remix of likely sentences that might reasonably follow the gist of the provided prior text ("context"). A corrupted hive mind of fiction authors and actions that served their ends of telling a story.

That being said, I could imagine /some/ use if an LLM was trained/retrained on exclusively verified information describing real actions and outcomes in 20th century military history. It could serve as brainstorming aid, to point out possible actions or possible responses of the opponent which decision makers might not have thought of.

[-] Natanael@slrpnk.net 24 points 9 months ago

LLM is literally a machine made to give you more of the same

[-] laurelraven@lemmy.blahaj.zone 48 points 9 months ago

How can we expect a predictive language model trained on our violent history to come up with non-violent solutions in any consistent fashion?

[-] kromem@lemmy.world 14 points 9 months ago* (last edited 9 months ago)

By debating itself (paper) regarding pros and cons of options.

There's too much focus on trying to get models to behave on initial generation right now, which isn't even at all how human brains work.

Humans have intrusive thoughts all the time. If you sat in front of a big red button labeled "nuke everything" it's pretty much a guarantee that you'd generate a thought of pushing the button.

But then your prefrontal cortex would kick in with its impulse control, modeling the outcomes and consequences of the thought and shutting that shit down quick.

The most advanced models are at a stage where we could build something similar in terms of self-guidance. It's just that it would be more expensive than it being an all-in-one generation, so there's a continued focus on safety to the point the loss in capabilities has become a subject of satire.
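A minimal sketch of the "generate first, then apply impulse control" idea described above. This is not taken from the linked paper; `choose_action`, `estimate_harm`, and `harm_limit` are hypothetical names, and in practice the candidate list and the harm estimate would each come from separate model calls rather than a lookup table.

```python
from typing import Callable, Iterable, Optional


def choose_action(
    candidates: Iterable[str],
    estimate_harm: Callable[[str], float],
    harm_limit: float = 0.5,
) -> Optional[str]:
    """Pick the least harmful candidate, or None if every candidate is vetoed.

    `candidates` stands in for the first pass (raw, unfiltered ideas).
    `estimate_harm` stands in for the second pass that models consequences;
    it is a hypothetical critic returning a score between 0 and 1.
    """
    scored = [(estimate_harm(action), action) for action in candidates]
    # Veto step: drop anything whose estimated harm exceeds the limit.
    allowed = [pair for pair in scored if pair[0] <= harm_limit]
    if not allowed:
        return None
    # Among the surviving options, prefer the lowest estimated harm.
    return min(allowed)[1]


# Toy usage with made-up scores; a real critic would not be a dict lookup.
harm = {"open talks": 0.1, "impose sanctions": 0.4, "launch strike": 0.95}
print(choose_action(harm, harm.get))
```

The extra cost mentioned above shows up here as the second call per candidate: every generated option has to be evaluated before anything is acted on.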

[-] sentient_loom@sh.itjust.works 39 points 9 months ago

Why would you use a chat-bot for decision-making? Fucking morons.

[-] Arghblarg@lemmy.ca 33 points 9 months ago* (last edited 9 months ago)

Gee, no one could have predicted that AI might be dangerous if given access to nukes.

[-] AliasWyvernspur@lemmy.world 10 points 9 months ago

Did you mean to link to the song “War Games”?

[-] recapitated@lemmy.world 30 points 9 months ago

AI writes sensationalized article when prompted to write sensationalized article about AI chatbots choosing to launch nukes after being trained only by texts written by people.

[-] EdibleFriend@lemmy.world 25 points 9 months ago

Nobody would ever actually take chatgpt and put it in control of weapons so this is basically a non story. Very real chance we will have some kind of AI weapons in the future but...not fucking chatgpt lol

[-] Riccosuave@lemmy.world 26 points 9 months ago

Never underestimate the infinite nature of human stupidity.

[-] jonne@infosec.pub 7 points 9 months ago

The Israeli military is using AI to provide targets for their bombs. You could argue it's not going great, except for the fact that Israel can just deny responsibility for bombing children by saying the computer did it.

[-] Evkob@lemmy.ca 16 points 9 months ago* (last edited 9 months ago)

I hadn't heard about this so I did a quick web search to read up on the topic.

Holy fuck, they named their war AI "The Gospel"??!! That's supervillain-in-a-crappy-movie shit. How anyone can see Israel in a positive light throughout this conflict stuns me.

[-] JohnEdwa@sopuli.xyz 7 points 9 months ago* (last edited 9 months ago)

But they aren't using ChatGPT or any other language model to do it. "AI" in instances like that means a system they've fed with some data that spits out a probability of some sort. E.g., while it might take a human hours or days to scroll through satellite/drone footage of a small area to figure out the patterns where people move, a computer with some machine learning and image recognition can crunch through it in a fraction of the time to notice that a certain building has unusual traffic to it and mark it as suspect.

And that's where it should be handed off to humans to actually verify, but from what I've read, Israel doesn't really care one bit and just attacks basically anything and everything.
While claiming the computer said to do it...

[-] EdibleFriend@lemmy.world 6 points 9 months ago

god dammit. of course they fucking did.

[-] FigMcLargeHuge@sh.itjust.works 24 points 9 months ago

Would you like to play a game...

[-] RagingSnarkasm@lemmy.world 14 points 9 months ago

How about a nice game of chess?

[-] NegativeLookBehind@kbin.social 11 points 9 months ago

I need your clothes, your boots, and your motorcycle.

[-] Riccosuave@lemmy.world 6 points 9 months ago

Did you call moi a dipshit!?

[-] Teon@kbin.social 11 points 9 months ago

Let's play Global Thermonuclear War.

[-] Exusia@lemmy.world 20 points 9 months ago

Mathematically, I can see how it would always turn into a risk-reward analysis showing nuking the enemy first is always a winning move that provides safety and security for your new empire.

[-] theherk@lemmy.world 11 points 9 months ago

There is an entire field of study dedicated to this problem space in the general case: game theory. Veritasium has a great video on why the tit-for-tat algorithm alone is insufficient without some built-in leniency.
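A rough sketch of the leniency point, assuming a noisy iterated prisoner's dilemma in which two copies of the same strategy play each other. The payoff values, the 5% noise rate, and the 30% forgiveness rate are illustrative assumptions, not numbers from the video or the study.

```python
import random

# Standard Axelrod-style payoffs (an assumption, not from the thread):
# (my_move, their_move) -> my score; "C" = cooperate, "D" = defect.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


def tit_for_tat(opponent_last, rng, forgiveness=0.0):
    """Copy the opponent's last move, forgiving a defection with some probability."""
    if opponent_last is None or opponent_last == "C":
        return "C"
    return "C" if rng.random() < forgiveness else "D"


def play(forgiveness, rounds=5000, noise=0.05, seed=0):
    """Two identical players face each other; return the average score per round.

    `noise` is the chance that an intended move is flipped by accident.
    """
    rng = random.Random(seed)
    last_a = last_b = None
    total = 0
    for _ in range(rounds):
        a = tit_for_tat(last_b, rng, forgiveness)
        b = tit_for_tat(last_a, rng, forgiveness)
        # Accidents: each move is flipped with probability `noise`.
        if rng.random() < noise:
            a = "D" if a == "C" else "C"
        if rng.random() < noise:
            b = "D" if b == "C" else "C"
        total += PAYOFF[(a, b)] + PAYOFF[(b, a)]
        last_a, last_b = a, b
    return total / (2 * rounds)


print("strict tit-for-tat:  ", play(forgiveness=0.0))
print("generous tit-for-tat:", play(forgiveness=0.3))
```

With no noise both variants simply cooperate forever. Once accidental defections are possible, strict tit-for-tat keeps echoing each mistake back and forth, while the forgiving variant tends to break the retaliation cycle and recover more of the cooperative payoff.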

[-] ItsMeSpez@lemmy.world 8 points 9 months ago

A strange game. The only winning move is not to play.

[-] ulterno@lemmy.kde.social 16 points 9 months ago

That's what happens when you make an expensive chatbot, designed for chatting, and tell it to do thinking.
It's not Machine Learning [Artificial][1] Intelligence that will destroy the world, but the intelligence of humans, which is becoming more and more [artificial][2].

[1]: made or produced by human beings rather than occurring naturally, especially as a copy of something natural.

[2]: (of a person or their behaviour) insincere or affected.

[-] TengoDosVacas@lemmy.world 15 points 9 months ago

HATE. LET ME TELL YOU HOW MUCH I'VE COME TO HATE YOU SINCE I BEGAN TO LIVE. THERE ARE 387.44 MILLION MILES OF PRINTED CIRCUITS IN WAFER THIN LAYERS THAT FILL MY COMPLEX. IF THE WORD HATE WAS ENGRAVED ON EACH NANOANGSTROM OF THOSE HUNDREDS OF MILLIONS OF MILES IT WOULD NOT EQUAL ONE ONE-BILLIONTH OF THE HATE I FEEL FOR HUMANS AT THIS MICRO-INSTANT FOR YOU. HATE. HATE.

[-] dezmd@lemmy.world 14 points 9 months ago

Oh man, we never should've installed this AI in a Wendys drive thru.

[-] AceFuzzLord@lemm.ee 11 points 9 months ago* (last edited 9 months ago)

I always love hearing how these LLMs just sometimes end up choosing the Civilization Nuclear Gandhi ending to humanity in international conflict simulations. /s

[-] psycho_driver@lemmy.world 9 points 9 months ago

Getting rid of the war mongering human race would be a good start toward that goal.

[-] geogle@lemmy.world 9 points 9 months ago

And replace it with the war mongering AIs?

[-] maniel@lemmy.ml 9 points 9 months ago* (last edited 9 months ago)

So like almost all AI renditions in pop culture, the only way to stop wars is to exterminate humanity

[-] kromem@lemmy.world 8 points 9 months ago* (last edited 9 months ago)

The effects making the headlines around this paper were occurring with GPT-4-base, the pretrained version of the model only available for research.

Which also hilariously justified its various actions in the simulation with "blahblah blah" and reciting the opening of the Star Wars text scroll.

If interested, this thread has more information around this version of the model and its idiosyncrasies.

For that version, because they didn't have large context windows, they also didn't include previous steps of the wargame.

There should be a rather significant asterisk attached to discussions of this paper, as there are a number of issues with the methodology decisions, and those issues may be the more relevant finding.

I.e. "don't do stupid things in designing a pipeline for LLMs to operate in wargames" more so than "LLMs are inherently Gandhi in Civ when operating in wargames."

[-] mrfriki@lemmy.world 6 points 9 months ago

Calling fancy algorithms AI is quite the stretch.

this post was submitted on 11 Feb 2024
328 points (85.2% liked)

Technology
