this post was submitted on 08 Aug 2025
386 points (99.7% liked)
Is it? The entire point of federation is that you can download all the data from another instance. Facebook is just training AI on the data that they’ve downloaded.
The point they're making is that they don't need to scrape the data. It is available via federation. Scraping the data is less efficient and can negatively affect the platform's performance, versus the built-in federation system where that data sync is intentional.
Especially when Meta has a fediverse presence. The reason they're scraping is likely because instances have blocked theirs, in part to prevent this exact thing.
Oh, right. I assumed “scraping” wasn’t meant literally. I assumed they were actually using an instance to pull in data (maybe using Threads), then training the AI off the data from their instance. If it is literally scraping, that’s pretty dumb.
They could just spin up a no-name instance that isn't associated with them to get it through federation, though. It still doesn't make sense to scrape.
They'd have to host it from somewhere not related to Meta in any way, otherwise someone on the fediverse would find that link and spread the word, and it would be blocked the exact same way. It only takes one person making that connection, and Meta knows they're hated.
Mega corps do that all the time. They have shell corporations for the exact purpose of obfuscating their future intentions.
They could stick it in Azure or AWS or something.
Or they could just use their existing scrapers and try to brute force it. Meta isn't exactly known for being sneaky.