488
Someone got Gab's AI chatbot to show its instructions
(mbin.grits.dev)
A nice place to discuss rumors, happenings, innovations, and challenges in the technology sphere. We also welcome discussions on the intersections of technology and society. If it’s technological news or discussion of technology, it probably belongs here.
Remember the overriding ethos on Beehaw: Be(e) Nice. Each user you encounter here is a person, and should be treated with kindness (even if they’re wrong, or use a Linux distro you don’t like). Personal attacks will not be tolerated.
Subcommunities on Beehaw:
This community's icon was made by Aaron Schneider, under the CC-BY-NC-SA 4.0 license.
I mean, I've got one of those "so simple it's stupid" solutions. It's not a pure LLM, but those are probably impossible... Can't have an AI service without a server after all, let alone drivers
Do a string comparison on the prompt, then tell the AI to stop.
And then, do a partial string match with at least x matching characters on the prompt, buffer it x characters, then stop the AI.
Then, put in more than an hour and match a certain amount of prompt chunks across multiple messages, and it's now very difficult to get the intact prompt if you temp ban IPs. Even if they managed to get it, they wouldn't get a convincing screenshot without stitching it together... You could just deny it and avoid embarrassment, because it's annoyingly difficult to repeat
Finally, when you stop the AI, you start printing out passages from the yellow book before quickly refreshing the screen to a blank conversation
Or just flag key words and triggered stops, and have an LLM review the conversation to judge if they were trying to get the prompt, then temp ban them/change the prompt while a human reviews it