So cool that you have to booby trap sites and insert delays and stuff because a handful of rich psychopaths refuse to respect robots.txt
/me waves in awkward sysadmin fashion
Looking forward to nonsense gibberish on the Internet being put to good use instead of politics!
Welcome aboard!
Hi, I really hope iocaine works for you, though I think it might still be wise to temper expectations. Some background: I work in bot detection and mitigation.
I quickly tried reading through their code and documentation but I don't see the main detection mechanism that determines human vs bot other than what you mentioned as an example. If it's user agent based, it is trivially easy to spoof as you already know. I am finding in my work that these companies do not keep the user agent they report in their documentation when challenged.
My second concern was the page the reverse proxy served when spoofing my user agent. The DOM was nowhere close to that of Lemmy and I think it's important to point out that a simple check for specific elements on the page will keep the bot from poisoning itself.
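To make the "simple check for specific elements" concrete, here is a minimal sketch of what a crawler could do, using only Python's standard library. The marker `("div", "id", "app")` is a made-up placeholder, not Lemmy's actual DOM, and `looks_like_real_page` is a hypothetical helper name.

```python
from html.parser import HTMLParser


class MarkerFinder(HTMLParser):
    """Records which expected (tag, attribute, value) markers show up in a page."""

    def __init__(self, markers):
        super().__init__()
        self.markers = set(markers)
        self.found = set()

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the current start tag.
        for name, value in attrs:
            if (tag, name, value) in self.markers:
                self.found.add((tag, name, value))


def looks_like_real_page(html, markers):
    """Return True only if every expected marker element is present."""
    finder = MarkerFinder(markers)
    finder.feed(html)
    return finder.found == finder.markers


# Hypothetical marker: an element the genuine site is assumed to always render.
EXPECTED = [("div", "id", "app")]

real = '<html><body><div id="app"><p>post</p></div></body></html>'
decoy = "<html><body><p>generated gibberish</p></body></html>"

print(looks_like_real_page(real, EXPECTED))
print(looks_like_real_page(decoy, EXPECTED))
```

A crawler that discards any page failing a check like this would skip the decoy, which is why a decoy that mirrors the real page structure would be harder to filter out.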
I admit I could be too close to this problem to see other solutions, and I really hope it works. It sucks that this is a problem. I wish there were more open source options too.
If for some reason this solution doesn't work, and if anyone is interested in help, I am more than happy to freely offer my knowledge.
Thanks. Iocaine doesn't do detection, it only does the poisoning. The detection is currently manual. We do it based on user agents and IP ranges. These bots are extraordinarily stupid atm, which is the biggest issue. The ones causing us downtime were hitting obsolete domains and stupid links constantly. They are very, very crude. They are not yet sophisticated enough to check the DOM, but they can tell when they've been blocked and switch to proxies. Sending them to iocaine is meant to keep them from realizing they're blocked.
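For anyone curious what manual matching on agents and IP ranges can look like, here is a rough sketch. It is not our actual config: the agent substrings, the networks (reserved documentation ranges), and the `choose_upstream` name are all placeholders, and in practice this decision would live in the reverse proxy rather than in Python.

```python
import ipaddress

# Placeholder values for illustration only, not a real blocklist.
BLOCKED_AGENT_SUBSTRINGS = ("examplebot", "samplecrawler")
BLOCKED_NETWORKS = [
    ipaddress.ip_network(net) for net in ("192.0.2.0/24", "2001:db8::/32")
]


def choose_upstream(user_agent, remote_ip):
    """Return "iocaine" for requests matching the manual lists, else "backend"."""
    ua = (user_agent or "").lower()
    if any(fragment in ua for fragment in BLOCKED_AGENT_SUBSTRINGS):
        return "iocaine"
    addr = ipaddress.ip_address(remote_ip)
    # Membership is False when the address and network IP versions differ.
    if any(addr in net for net in BLOCKED_NETWORKS):
        return "iocaine"
    return "backend"


print(choose_upstream("Mozilla/5.0 (compatible; ExampleBot/1.0)", "198.51.100.7"))
print(choose_upstream("Mozilla/5.0", "192.0.2.55"))
print(choose_upstream("Mozilla/5.0", "198.51.100.7"))
```

Matched requests still get a normal-looking response from the decoy instead of an error, which is the point: a hard block is what tips the bot off and makes it switch to proxies.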
Obviously someone smart can easily defeat it, even by just respecting our resources. But these fuckers are very greedy atm. We'll have to evolve along with them.
For what it's worth, this is just damage control and a first step. Deployment was trivial compared to most other ideas, so it seemed worth at least giving it a go.
Our expectations are very much tempered, but we're trying to be optimistic about even a small reprieve.
Thanks for the DOM detail!
I feel you and feel for you. I really do hope you get a reprieve because dealing with this is nonsense.
Nice :D I thought of this before but didn't know there was software for it. It looks great. Glad we're using it, fuck AI crawlers.
Good looking out! Appreciate it!
Also, anything that adds surreal nonsense is something I support, especially using it on bots.
Good job!
(say hello)
Arise, Chicken.
Great!
Just idle curiosity: Any particular reason for iocaine vs. any of the other similar projects (found this list on the iocaine homepage) out there?
Just the first one I ran into, and it was easy to deploy.