Hail Anubis-chan.
any "bot stopper" ends up stopping me somehow. Including anubis. I'm pretty sure ive been cursed by the rng gods because even at 40 KH/s, I get stuck on the pages for like 2 minutes before it tells me success.
Similar things like hcaptcha or cloudflare turnstile either never load or never succeed. Recaptcha gaslights me into thinking I was wrong.
What's this about?
Anubis is a simple anti-scraper defense that weighs a web client's soul by giving it a tiny proof-of-work workload (some calculation that doesn't have an efficient solution, like cryptography) before letting it pass through to the actual website. The workload is insignificant for human users, but very taxing for high-volume scrapers. The calculations are done on the client's side using Javascript code.
(edit) For clarification: this works because the computation workload takes a relatively long time, not because it bogs down the CPU. Halting each request at the gate for only a few seconds adds up very quickly.
Recently, the FSF published an article that likened Anubis to malware because it's basically arbitrary code that the user has no choice but to execute:
[...] The problem is that Anubis makes the website send out a free JavaScript program that acts like malware. A website using Anubis will respond to a request for a webpage with a free JavaScript program and not the page that was requested. If you run the JavaScript program sent through Anubis, it will do some useless computations on random numbers and keep one CPU entirely busy. It could take less than a second or over a minute. When it is done, it sends the computation results back to the website. The website will verify that the useless computation was done by looking at the results and only then give access to the originally requested page.
Here's the article, and here's aussie linux man talking about it.
fwiw Anubis is working on a more respectful update, this was their first pass solution for what was basically a break glass emergency. i understand FSF's concern, but Anubis is the only thing that's making a free and open internet remotely possible right now, and far better it that nightmare fuel like cloudflare
Well, that's a typically abstract, to-the-letter take on the definition of software freedom from them. I think the practical necessity of doing something like this, especially for services like Invidious that are at risk, and the fact it's a harmless nonsense calculation really deserves an exception.
aussie linux man
How did I know exactly who you were talking about before clicking the link?
The outro song played in my head...
But they can still scrape it, it just costs them computation?
Correct. Anubis' goal is to decrease the web traffic that hits the server, not to prevent scraping altogether. I should also clarify that this works because it costs the scrapers time with each request, not because it bogs down the CPU.
Why not then just make it a setTimeout or something so that it doesn't nuke the CPU of old devices?
Crawlers don't have to follow conventions or specifications. If one has a setTimeout
implementation that doesn't wait the specified amount of time and simply executes the callback immediately, it defeats the system. Proof-of-work is meant to ensure that it's impossible to get around the time factor because of computational inefficiency.
Anubis is an emergency solution against the flood of scrapers deployed by massive AI companies. Everybody wishes it wasn't necessary.
Beautiful
I did notbknow FSF is complaining about anubis doesn't a bunch of fsf-ally organizations use it?
The block underneath ai is python
I heard this picture
Programmer Humor
Welcome to Programmer Humor!
This is a place where you can post jokes, memes, humor, etc. related to programming!
For sharing awful code theres also Programming Horror.
Rules
- Keep content in english
- No advertisements
- Posts must be related to programming or programmer topics