Screenshot of a tumblr post by hbmmaster:
the framing of generative ai as “theft” in popular discourse has really set us back. like, not only should we not consider copyright infringement theft, we shouldn’t even consider generative ai copyright infringement
who do you think benefits from redefining “theft” to include “making something indirectly derivative of something created by someone else”? because I can assure you it’s not artists
okay I’m going to mute this post, I’ll just say,
if your gut reaction to this is that you think this is a pro-ai post, that you think “not theft” means “not bad”, I want you to think very carefully about what exactly “theft” is to you and what it is about ai that you consider “stealing”.
do you also consider other derivative works to be “stealing”? (fanfiction, youtube poops, gifsets) if not, why not? what’s the difference? because if the difference is actually just “well it’s fine when a person does it” then you really should try to find a better way to articulate the problems you have with ai than just saying it’s “stealing from artists”.
I dislike ai too, I’m probably on your side. I just want people to stop shooting themselves in the foot by making anti-ai arguments that have broader anti-art implications. I believe in you. you can come up with a better argument than just calling it “theft”.
This is interesting. I agree that stealing isn't the right category. Copyright infringement may be, but there needs to be a more specific question we are exploring.
Is it acceptable to make programmatic transformations of copyrighted source material without the copyright holder's permission for your own work?
Is it acceptable to build a product which contains the copyrighted works of others without their permission? Is it different if the works contained in the product are programmatically transformed prior to distribution?
Should the copyright holders be compensated for this? Is their permission necessary?
The same questions apply to the use of someone's voice or likeness in products or works.
Somebody correct me if I'm wrong, but my understanding of how image generation models and their training work is that the end product does not, in fact, contain any copyrighted material or any transformation of that copyrighted material. The training process refines a set of numbers in the model, but those numbers can't really be considered a transformation of the input.
To preface what I'm about to say, LLMs and image models are absolutely not intelligent, and it's fucking stupid that they're called AI at all. However, if you look at somebody's art and learn from it, you don't contain a copyrighted piece of their work in your head, or a transformation of that copyrighted work. You've just refined your internal computer's knowledge and understanding of the work. I believe the way image models are trained could be compared to that.
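(A toy sketch of that idea, assuming nothing about any real model's architecture: training nudges a set of numbers by gradient descent, and the finished model holds only those adjusted numbers, not the examples it saw.)

    # Toy illustration of "training refines a set of numbers":
    # fit weights w so that model(x) = w[0] + w[1]*x approximates the data.
    # This is not a real image model; it just shows that what training
    # keeps is adjusted parameters, not copies of the training examples.

    data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]  # stand-in "training set"
    w = [0.0, 0.0]                               # the model IS these numbers
    lr = 0.05                                    # learning rate

    for step in range(2000):
        for x, y in data:
            pred = w[0] + w[1] * x
            err = pred - y
            # gradient step: nudge the numbers, then discard the example
            w[0] -= lr * err
            w[1] -= lr * err * x

    print(w)  # ~[1.0, 2.0]: two floats; the data itself is not in the model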
the generated product absolutely contains elements of the things it copied from. imagine the difference between someone making a piece of art that is heavily inspired by someone else's work vs. directly tracing the original and passing it off as entirely yours
The magic word here is transformative. If your use of source material is minimal and distinct, that's fair use.
If a 4 GB model contains the billion works it was trained on, it contains four bytes of each.
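(Spelling out that back-of-the-envelope arithmetic, taking the 4 GB and one-billion-works figures as given rather than measured:)

    model_bytes = 4 * 10**9     # a 4 GB model, per the comment above
    works = 10**9               # "the billion works it was trained on"
    print(model_bytes / works)  # 4.0 -- four bytes per work, on average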
What the model does can be wildly different from any particular input.
Using people's work and math to make predictions is not transformative. Human creations are transformative.
Any transformation is transformative.