lads (lemmy.world)
[-] daniskarma@lemmy.dbzer0.com -4 points 2 days ago* (last edited 1 day ago)

There are only a small number of AI companies training full LLMs, and they usually do a few training runs per year. What most people see as "AI bots" are not actually that.

The influence of AI over the net is another topic. But Anubis is not doing anything about that either: it just makes the AI bots waste more energy getting the data, or at most the data under "Anubis protection" does not enter the training dataset. The AI will still be there.

Am I on the list of "good bots"? Sometimes I scrape websites for price tracking or change tracking. If I see a website running malware on my end, I would most likely just block that site: one legitimate user less.

[-] squaresinger@lemmy.world 4 points 1 day ago

That's outdated info. Yes, not a lot of scraping is really necessary for training. But LLMs are currently often coupled with web search to improve results.

So for example, if you ask ChatGPT to find a specific product for you, the result doesn't come from the model. Instead it does a web search, loads the results, summarizes them, and returns the summary plus the links. This is a time-critical operation since the user is waiting for the results. It's also a bad operation for the site being scraped in many situations (mostly when looking for info, not for products), since the user might be satisfied with the summary and never click through to the source.

So if you can delay scraping like that by a few seconds, that's quite significant.
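
To make the "few seconds" point concrete, here is a minimal sketch of a hash-based proof-of-work challenge, the general kind of check being discussed. It is not Anubis's actual implementation: the challenge string, the difficulty value, and the function names `solve_challenge` and `verify` are all made up for illustration. The client has to brute-force a nonce, while the server only needs one hash to check the answer.

```python
import hashlib
import time


def solve_challenge(challenge: str, difficulty: int) -> int:
    """Brute-force a nonce so that sha256(challenge + nonce) starts
    with `difficulty` zero hex digits (hypothetical scheme)."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce
        nonce += 1


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: a single hash is enough to check the client's work."""
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)


if __name__ == "__main__":
    start = time.perf_counter()
    nonce = solve_challenge("example-challenge", 4)
    elapsed = time.perf_counter() - start
    ok = verify("example-challenge", nonce, 4)
    print(f"nonce={nonce} found in {elapsed:.3f}s, valid={ok}")
```

Each extra zero digit multiplies the expected work by 16, so the delay can be tuned. A person opening one page barely notices it, but a tool that loads many result pages while a user waits pays that cost on every site.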

I (and A LOT of lemmings) have already had enough of AI. We DON'T need AI-everything. So we block AI training or make it harder. We didn't say "hey, please train your LLM on our data" anyways.

[-] daniskarma@lemmy.dbzer0.com 0 points 2 days ago* (last edited 1 day ago)

That's legitimate.

But it's not "open", nor "free".

Also, it's a bit of a placebo. For instance, Lemmy is not an Anubis use case, as Lemmy can be legitimately scraped by any agent through the federation system. And I don't really know how Anubis would even work with the openness of the Lemmy API.

this post was submitted on 13 Aug 2025
847 points (98.2% liked)
