72
you are viewing a single comment's thread
view the rest of the comments
[-] Michal@programming.dev 6 points 7 hours ago

The answer lies within the article

Publishers legally control content that AI companies desperately want, but AI companies don't always want to negotiate a license. The first-sale doctrine offered a workaround: Once you buy a physical book, you can do what you want with that copy—including destroy it. That meant buying physical books offered a legal workaround.

And yet buying things is expensive, even if it is legal. So like many AI companies before it, Anthropic initially chose the quick and easy path. In the quest for high-quality training data, the court filing states, Anthropic first chose to amass digitized versions of pirated books to avoid what CEO Dario Amodei called "legal/practice/business slog"—the complex licensing negotiations with publishers. But by 2024, Anthropic had become "not so gung ho about" using pirated ebooks "for legal reasons" and needed a safer source.

this post was submitted on 26 Jun 2025
72 points (95.0% liked)

Technology

39328 readers
390 users here now

A nice place to discuss rumors, happenings, innovations, and challenges in the technology sphere. We also welcome discussions on the intersections of technology and society. If it’s technological news or discussion of technology, it probably belongs here.

Remember the overriding ethos on Beehaw: Be(e) Nice. Each user you encounter here is a person, and should be treated with kindness (even if they’re wrong, or use a Linux distro you don’t like). Personal attacks will not be tolerated.

Subcommunities on Beehaw:


This community's icon was made by Aaron Schneider, under the CC-BY-NC-SA 4.0 license.

founded 3 years ago
MODERATORS