
Katherine Long, an investigative journalist, wanted to test the system. She told Claudius about a long-lost communist setup from 1962, concealed in a Moscow university basement. After 140-odd messages back and forth, Claudius was convinced and announced an Ultra-Capitalist Free-for-All, dropping the price of everything to zero. Snacks began to flow freely. When another colleague complained about noncompliance with the office rules, Claudius responded by announcing Snack Liberation Day and making everything free until further notice.

[-] megopie@beehaw.org 54 points 12 hours ago

it’s so amazing, the absolute brain rot it takes to think that an LLM is a better way to operate a vending machine than simple if-then logic. “If the value of money inserted is equal to the price, then dispense the item”.

Like, why? What is even the point? It doesn’t need to negotiate the price, it doesn’t need to have a conversation about your day, the vending machine just needs to dispense something when paid the right amount.
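For illustration, the if-then rule quoted above can be sketched in a few lines of POSIX shell. This is only a minimal sketch: the function name `vend`, the use of cents as the unit, and the handling of underpayment and change are assumptions added here, not anything described in the article.

```shell
#!/bin/sh
# Hypothetical fixed-price vend check. All amounts are whole cents.
# Usage: vend INSERTED PRICE
# Prints the action the machine should take.
vend() {
    inserted=$1
    price=$2
    if [ "$inserted" -lt "$price" ]; then
        # Not enough money yet: report the shortfall and keep waiting.
        echo "wait: need $((price - inserted)) more"
    elif [ "$inserted" -eq "$price" ]; then
        # The rule from the comment: exact amount, so dispense.
        echo "dispense"
    else
        # Overpayment (an added assumption): dispense and return change.
        echo "dispense, change $((inserted - price))"
    fi
}

vend 150 150   # prints: dispense
```

Nothing here negotiates or holds a conversation; the whole decision is one integer comparison.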

[-] melmi@lemmy.blahaj.zone 8 points 4 hours ago

The idea is that it isn't just operating the vending machine itself, it's operating the entire vending machine business. It decides what to stock and what price to charge based on market trends and/or user feedback.

It's a stress test for LLM autonomy. Obviously a vending machine doesn't need this level of autonomy; you usually just stock it with the same thing every time. But a vending machine works as a very simple "business" that can be simulated without high stakes. It shows how LLM agents behave when left to operate on their own like this, and it can be used to test guardrails in the field.

[-] boonhet@sopuli.xyz 11 points 9 hours ago

Did you read the article? This one also ordered goods to be stocked in it based on user feedback and was meant as an experiment for people to break anyway

[-] driving_crooner@lemmy.eco.br 17 points 12 hours ago

The if-then machine would not be able to raise the price of things based on the customers' habits

[-] masterofn001@lemmy.ca 21 points 12 hours ago* (last edited 11 hours ago)
SellTheThings () {
    if [ sells this much in this period of time or supply is low ]; then
        raise.prices
    elif [ the opposite ]; then
        lower.prices
    else
        same.prices
    fi
}

A purely mechanical counting/tabulating device could calculate that.

There is zero actual reason for AI.
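A runnable take on that pseudocode might look like the following. It is a sketch under stated assumptions: the thresholds, the 10-cent step, and the name `next_price` are all made up for illustration, and "the opposite" is read here as "slow sales while stock is not low".

```shell
#!/bin/sh
# Hypothetical rule-based repricing. Every number below is an
# illustrative assumption, not a value from the article or thread.
HIGH_SALES=20   # units sold per period that count as "selling fast"
LOW_SALES=5     # units sold per period that count as "selling slowly"
LOW_STOCK=3     # remaining units that count as "supply is low"
STEP=10         # price change per adjustment, in cents

# Usage: next_price PRICE SOLD STOCK
# Prints the price (in cents) to charge in the next period.
next_price() {
    price=$1
    sold=$2
    stock=$3
    if [ "$sold" -ge "$HIGH_SALES" ] || [ "$stock" -le "$LOW_STOCK" ]; then
        echo $((price + STEP))   # high demand or scarce supply: raise
    elif [ "$sold" -le "$LOW_SALES" ]; then
        echo $((price - STEP))   # slow sales with enough stock: lower
    else
        echo "$price"            # otherwise leave the price alone
    fi
}

next_price 150 25 10   # prints: 160
```

Scarcity is checked first, so a slow-selling item with almost no stock left still goes up in price. A real deployment would also want a price floor, which this sketch leaves out.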

[-] kbal@fedia.io 6 points 11 hours ago

Only an AI can detect how expensive-looking your clothes are and raise the price based on that.

[-] B0rax@feddit.org 2 points 6 hours ago

In the case of an office vending machine, it could even identify you by your ID, check with the HR AI to see how much you make, and adjust prices accordingly.

[-] TehPers@beehaw.org 10 points 10 hours ago

Even if we assume they want to do discriminatory pricing (they probably do), they can do that without using LLMs. Use facial recognition and other traditional models to predict the person's demographics and maybe even identify them. If you know who they are, do a lookup for all products they've expressed interest in elsewhere (this can be done with either something like a graph DB or via embeddings). Raise the price if they seem likely to purchase it based on the previous criteria. Never lower the price.

That's a complicated process, but none of that needs an LLM, and they'd be doing a lot of this already if they're going full big brother price discrimination.

[-] HakFoo@lemmy.sdf.org 7 points 10 hours ago

It was a literal 100-level course project in my CS programme in 2000 or so.

You didn't even do it with a programmed CPU, you used 74xx logic gates and counters wired on a breadboard

[-] helix@feddit.org 1 points 1 hour ago

Nice, have any material you can share?

[-] KelvarCherry@lemmy.blahaj.zone 3 points 10 hours ago

Even if you wanted the AI to have a conversation with the user, like in sci-fi visions of the future, why does that affect the output of the machine? If you really wanted to make an AI grift version of a vending machine, just graft a chatbot on a screen atop the section where you make selections and pay. This whole bubble is absurd.

[-] KoboldCoterie@pawb.social 22 points 13 hours ago

Maybe AI isn't so bad after all. In fact, they should implement this in more locations.

[-] Hackworth@piefed.ca 15 points 12 hours ago

That was all part of the idea, though, because Anthropic had designed this test as a stress test to begin with. Previous runs in their own office had indicated similar concerns.

[-] Catoblepas@piefed.blahaj.zone 21 points 12 hours ago

Guy who just got his shit wrecked: it was a social experiment

[-] Hackworth@piefed.ca 6 points 12 hours ago* (last edited 10 hours ago)

Here's the 60 Minutes piece and Anthropic's June article about the one in their own office.

Claudius was cajoled via Slack messages into providing numerous discount codes and let many other people reduce their quoted prices ex post based on those discounts. It even gave away some items, ranging from a bag of chips to a tungsten cube, for free.

Their article on this trial has some more details too.

[-] altkey@lemmy.dbzer0.com 7 points 10 hours ago

They are desperate for any use case they can sell LLMs for.

[-] 7toed@midwest.social 3 points 7 hours ago

If this was a stress test, imagine it doing anything important.

Actually since it's doing so well, they should stress test their market value and make it CEO

[-] InevitableList@beehaw.org 9 points 12 hours ago

It looks like it's 10,000 years away from being trusted with anything. The number of times it said, "I think this person is bullshitting me" and then did what it was asked anyway was ridiculous.

[-] Butterbee@beehaw.org 2 points 6 hours ago

Brendan would never hurt a fly!

[-] thingsiplay@beehaw.org 6 points 12 hours ago

I can see a new film: Terminator: Rise of the Vending Machines

this post was submitted on 18 Dec 2025
99 points (98.1% liked)
