Screenshot of a tumblr post by hbmmaster:
the framing of generative ai as “theft” in popular discourse has really set us back so far like not only should we not consider copyright infringement theft we shouldn’t even consider generative ai copyright infringement
who do you think benefits from redefining “theft” to include “making something indirectly derivative of something created by someone else”? because I can assure you it’s not artists
okay I’m going to mute this post, I’ll just say,
if your gut reaction to this is that you think this is a pro-ai post, that you think “not theft” means “not bad”, I want you to think very carefully about what exactly “theft” is to you and what it is about ai that you consider “stealing”.
do you also consider other derivative works to be “stealing”? (fanfiction, youtube poops, gifsets) if not, why not? what’s the difference? because if the difference is actually just “well it’s fine when a person does it” then you really should try to find a better way to articulate the problems you have with ai than just saying it’s “stealing from artists”.
I dislike ai too, I’m probably on your side. I just want people to stop shooting themselves in the foot by making anti-ai arguments that have broader anti-art implications. I believe in you. you can come up with a better argument than just calling it “theft”.
Somebody correct me if I'm wrong, but my understanding of how image-generation models and their training work is that the end product does not, in fact, contain any copyrighted material or any transformation of that copyrighted material. The training process refines a set of numbers in the model (its weights), but those numbers can't really be considered a transformation of the input.
To preface what I'm about to say: LLMs and image models are absolutely not intelligent, and it's fucking stupid that they're called AI at all. However, if you look at somebody's art and learn from it, you don't carry a copyrighted piece of their work in your head, or a transformation of that copyrighted work. You've just refined your internal computer's knowledge and understanding of it. I believe the way image models are trained can be compared to that.
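To make that concrete, here's a minimal toy sketch (hypothetical model and numbers, nothing like a production pipeline): each training image only nudges the model's weights through a gradient step and is then thrown away. Nothing resembling the image is stored.

```python
import torch
import torch.nn as nn

# Tiny stand-in "image model": maps a flattened 64x64 image to a 64x64 image.
model = nn.Sequential(nn.Linear(64 * 64, 256), nn.ReLU(), nn.Linear(256, 64 * 64))
opt = torch.optim.SGD(model.parameters(), lr=1e-3)

for _ in range(100):                      # stand-in for looping over a dataset
    image = torch.rand(1, 64 * 64)        # placeholder "training image"
    noisy = image + 0.1 * torch.randn_like(image)
    loss = nn.functional.mse_loss(model(noisy), image)  # learn to denoise
    opt.zero_grad()
    loss.backward()
    opt.step()                            # weights shift slightly; the image is gone

# All that persists after training is the parameter tensors:
print({name: tuple(p.shape) for name, p in model.named_parameters()})
```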
This is generally correct, though diffusion models and GPTs work in totally different ways. Assuming an entity had lawful access to the image in the first place, nothing that persists in a trained diffusion model can realistically be considered a copy of any particular training image by anyone who knows wtf they're talking about.
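For what it's worth, the generation side is the same in spirit. This is a crude toy illustration, not a real diffusion sampler (the untrained "denoiser" and the update rule are stand-ins): output starts as random noise and is repeatedly pushed through the weights, never looked up from any stored image.

```python
import torch
import torch.nn as nn

# Hypothetical tiny "denoiser" standing in for a trained diffusion model.
denoiser = nn.Sequential(nn.Linear(64 * 64, 256), nn.ReLU(), nn.Linear(256, 64 * 64))

x = torch.randn(1, 64 * 64)              # generation starts from pure noise
with torch.no_grad():
    for _ in range(50):                  # crude stand-in for a sampling schedule
        x = 0.9 * x + 0.1 * denoiser(x)  # each step only applies the weights
print(x.shape)                           # a new array, unique to this noise seed
```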
The generated product absolutely contains elements of the things it copied from. Imagine the difference between someone making a piece of art that is heavily inspired by someone else's work vs. directly tracing the original and passing it off as entirely yours.
I understand that's how you think of it, but I'm talking about the technology itself. There is absolutely no copy of the original work, in the sense of ones and zeros.
The image-generation model itself contains none of the data that made up the works it was trained on, so the output of the model can't be considered a copy of copyrighted work.
Yes, you can train models to mimic an artist's style or work, but it's nothing like tracing an image; your comparison is completely wrong. Each generated image is a new image synthesized from the model's weights, and those weights do not contain any of the original work.
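If you want to check this yourself, dump a model checkpoint: all you'll find is named arrays of floating-point weights. (Toy model and file name here are hypothetical.)

```python
import torch
import torch.nn as nn

# Save a toy model's checkpoint and inspect what's actually in it.
model = nn.Sequential(nn.Linear(64 * 64, 256), nn.ReLU(), nn.Linear(256, 64 * 64))
torch.save(model.state_dict(), "toy_model.pt")

for name, tensor in torch.load("toy_model.pt").items():
    # Just names, dtypes, and shapes of weight tensors; no image data anywhere.
    print(name, tensor.dtype, tuple(tensor.shape))
```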