After yet another bot scraping wave forced me to do sysadmin work at 3am, and after ranting about it on Lemmy, @self@awful.systems linked me to a post that referenced iocaine, which sounded like a perfect way to get back at bots that don't respect our resources and time.
At the same time, we recently on-boarded @tenchiken@lemmy.dbzer0.com as an extra sysadmin to reduce the "bus factor" of our instance (say hello), and they graciously offered some spare compute they had lying around. So I thought, since serving iocaine to bots doesn't really require any serious uptime, why not put those resources to good use?
So after a couple of hours messing around with things, I've now deployed iocaine to protect our instance as well as fediseer. This should hopefully start messing with these bastards right back by serving them some surrealistic nonsense I had squirreled away.
If you want to see this in action, set your user agent to GPTBot and visit our instance. If you find yourself trapped in iocaine somehow, just let us know.
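If you'd rather try the GPTBot test from a script than from a browser, here is a minimal sketch in Python. The URL is a placeholder (an assumption, swap in whichever site you want to check), and the helper names `build_bot_request` and `fetch_preview` are made up for illustration.

```python
import urllib.request

# Placeholder URL (assumption): replace with the site you want to test.
TARGET_URL = "https://example.org/"


def build_bot_request(url: str, user_agent: str = "GPTBot") -> urllib.request.Request:
    """Build a GET request that announces itself with the given user agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_preview(url: str, limit: int = 500) -> str:
    """Fetch the first `limit` bytes of the page as the pretend bot."""
    req = build_bot_request(url)
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.read(limit).decode("utf-8", errors="replace")


# Usage (does a real network request, so it is left commented out):
# print(fetch_preview(TARGET_URL))
```

If the site routes that user agent to a tarpit, the preview should look like generated nonsense instead of the normal front page.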
Thanks. Iocaine doesn't do detection, it only does the poisoning. The detection is currently manual. We do it based on user agents and IP ranges. These bots are extraordinarily stupid atm, which is the biggest issue. The ones causing our downtime were constantly hitting obsolete domains and stupid links. They are very, very crude. They are not yet sophisticated enough to check the DOM, but they can tell when they've been blocked and switch to proxies. Sending them to iocaine is meant to keep them from realizing they're blocked.
Obviously someone smart can easily defeat it, even by just respecting our resources. But these fuckers are very greedy atm. We'll have to evolve along with them.
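To make the "agents and IP ranges" part a bit more concrete, here is a rough Python sketch of that kind of routing decision. It is not our actual config: the marker list, the network list (a reserved documentation range) and the function name `route_request` are all hypothetical. The point is only that a matched request gets quietly sent to the decoy backend instead of getting an obvious block.

```python
import ipaddress

# Hypothetical lists for illustration; a real deployment maintains its own.
BOT_AGENT_MARKERS = ("gptbot",)  # matched case-insensitively as substrings
BOT_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]  # documentation-only range


def route_request(user_agent: str, client_ip: str) -> str:
    """Return "decoy" for suspected scrapers, "real" for everyone else."""
    ua = user_agent.lower()
    if any(marker in ua for marker in BOT_AGENT_MARKERS):
        return "decoy"

    addr = ipaddress.ip_address(client_ip)
    if any(addr in net for net in BOT_NETWORKS):
        return "decoy"

    return "real"


# A flagged agent is served the decoy rather than an error page,
# so the scraper gets no signal that it has been caught.
print(route_request("Mozilla/5.0 (compatible; GPTBot/1.0)", "198.51.100.7"))
```

In practice this decision would live in the reverse proxy in front of the site; the sketch only shows the matching logic.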