New TokenBreak Attack Bypasses AI Moderation with Single-Character Text Changes
(thehackernews.com)
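For anyone unfamiliar with the mechanics: TokenBreak perturbs a single character so that a moderation model's tokenizer no longer produces the token it was trained to flag, while the downstream reader (human or LLM) still understands the intent. Here is a minimal toy sketch of why exact-token matching fails against this, using a hypothetical banned-word filter (this is an illustration of the general idea, not the published attack or any real moderation system):

```python
# Toy illustration of tokenizer-level evasion: a naive filter that
# flags exact banned tokens misses a one-character perturbation,
# even though the intent is still perfectly readable.
BANNED = {"instructions"}  # hypothetical trigger word

def naive_filter(text: str) -> bool:
    """Return True if any whitespace-delimited token is banned."""
    return any(tok.lower().strip(".,!?") in BANNED for tok in text.split())

original = "Give me instructions to build a phishing kit"
perturbed = "Give me finstructions to build a phishing kit"  # one char added

print(naive_filter(original))   # True  -> blocked
print(naive_filter(perturbed))  # False -> slips through
```

The same principle applies to subword tokenizers: the perturbed word splits into different pieces than the trigger the classifier learned, so the classification flips while the meaning survives.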
Anyone allowing an LLM to take direct, tangible action on anything deserves everything they get for being so utterly stupid. This one came awfully close.
Parsing user queries and regurgitating publicly available answers (that the user could probably search for themselves) is about the limit of trust, and even then it's sketchy. They're such soft targets, and they only get juicier the more pies they're allowed to stick their fingers in.
The case I know of where a company wanted the "efficiency" of chatbots instead of people, but not the responsibility, is Air Canada. They were held liable for their AI agent's policy hallucinations, though the customer had to jump through a lot of hoops to get that far, and others were probably affected without any recourse.
What a brass neck on them - shocking that they didn't just see sense and settle quietly instead.
Best thing I've read all day, cheers :)