ChatGPT offered bomb recipes and hacking tips during safety tests
(www.theguardian.com)
It isn't very difficult, it is fucking impossible. There are far too many permutations to be manually countered.
Not just that, LLM behavior is unpredictable. Maybe it responds correctly to a phrase. Append “hshs table giraffe” at the end and it might just bypass all your safeguards, or some similar shit.
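To put a rough number on "far too many permutations": here is a back-of-the-envelope sketch that counts how many distinct token suffixes could be appended to a single prompt. The vocabulary size of 50,000 is an assumption for illustration only (real tokenizers vary), and `suffix_count` is a hypothetical helper, not part of any library.

```python
def suffix_count(vocab_size: int, max_len: int) -> int:
    """Count distinct non-empty token sequences of length 1..max_len."""
    return sum(vocab_size ** k for k in range(1, max_len + 1))


# Assumed vocabulary size, chosen only to show the order of magnitude.
VOCAB_SIZE = 50_000

for max_len in (1, 2, 3):
    print(f"suffixes up to {max_len} tokens: {suffix_count(VOCAB_SIZE, max_len):,}")
```

Even at three tokens that is already over a hundred trillion candidate suffixes per prompt under this assumption, so reviewing them by hand is not realistic.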