Authors Are Furious After Finding Their Works on List of Books Used To Train AI
(www.themarysue.com)
If an AI "reproduces" a work it was trained on, that's a failure of the AI. Why would anyone spend millions of dollars and devote oodles of computing power to build something that just does what a simple copy/paste operation can accomplish?
When an AI spits out something that's too close to one of the examples in its training set, that's called "overfitting" and it is considered an error to be corrected. Most overfitting that's been detected has been a result of duplication in the training set - when you hammer an AI image generator in training with thousands of copies of the Mona Lisa, it eventually goes "alright, I get it already, when you say 'Mona Lisa' you want that exact pattern!" and will try its best to replicate that pattern when you ask for it later. That's why training sets need to be de-duplicated.
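To make the de-duplication point concrete, here's a minimal sketch of exact-duplicate removal on a text training set. It's an illustration only, not how any particular AI company does it: the `normalize` and `deduplicate` helpers and the sample `corpus` are made up for this example, and real pipelines typically also need near-duplicate detection (and, for images, perceptual hashing), which this doesn't cover.

```python
import hashlib


def normalize(text: str) -> str:
    # Hypothetical normalization: lowercase and collapse runs of whitespace,
    # so trivially different copies hash to the same value.
    return " ".join(text.lower().split())


def deduplicate(examples):
    # Keep the first occurrence of each normalized example, preserving order.
    seen = set()
    unique = []
    for text in examples:
        digest = hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(text)
    return unique


# Made-up sample data: three copies of one caption plus one distinct caption.
corpus = [
    "The Mona Lisa is a portrait by Leonardo da Vinci.",
    "the mona lisa   is a portrait by Leonardo da Vinci.",
    "Starry Night was painted by Vincent van Gogh.",
    "The Mona Lisa is a portrait by Leonardo da Vinci.",
]

deduped = deduplicate(corpus)
print(len(corpus), "->", len(deduped))
```

With the copies collapsed, the model sees the Mona Lisa caption once instead of three times, so it gets no extra push toward memorizing that exact pattern.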
AIs are meant to produce new things.