328
Amazon’s First ‘Fallout’ Show Art Is AI Generated
(www.forbes.com)
This is a most excellent place for technology news and articles.
I considered including mention of overfitting in my earlier comment, but since it's such an edge case I felt it would just be an irrelevant digression.
When a particular image has a great many duplicates in the training set - hundreds or even thousands of copies are necessary - then you get the phenomenon of overfitting. In that case you do get this sort of "memorization" of a particular image, because during training you are hitting the neural net over and over with the exact same inputs and really drilling it into them. This is universally considered undesirable, because there's no point to it - why spend thousands of dollars to do something that a copy/paste command could do so much better and more easily? So when image generators are trained the training data goes through a "de-duplication" step intended to try to prevent this sort of thing from happening. Images like the Mona Lisa are so incredibly common that they still slip through the cracks, though.
There's a paper from some months back that commonly comes up when people want to go "aha, generative AI copies its training data!" But in reality this paper shows just how difficult it is to arrange for overfitting to happen. The researchers used an older version of Stable Diffusion whose training set was not well curated and is no longer used due to its poor quality, and even then it took them hundreds of millions of attempts to find just a handful of images from the training set that they could dredge back out of it in recognizable form.