326
The New York Times sues OpenAI and Microsoft for copyright infringement
(edition.cnn.com)
This is a most excellent place for technology news and articles.
Not the original comment but I think the difference you're looking for is in the copying and distribution. The OC makes the false assumption that the data set is full copies of every object fed into it rather than sets of common characteristics.
For example, my own mind has a concept tree. Tree is not a copy of every tree I've ever known but more like lists of common characteristics that define treeness based on information I've gathered about treeness (my data set).
Piracy is piracy not because of how it's consumed, but rather, how it's distributed and stored, as full copies of the object. Datasets are not copies, in other words. And thus copyright doesn't apply.
Reading an article to get an idea about what articleness is, is fair use. Reading an article to reproduce it verbatim is not. And as of now, I don't believe LLMs are doing the later.