Open Source in the age of license laundering
The claim that they are doing a clean-room implementation is bullshit. The only way any of these models can produce working code is by having been trained on every bit of code that could be scraped from the internet. Unless the project you are cloning was released after the model was trained, the model was trained on that project's code. It may be a tiny fragment of the training data, but the model still saw it.
An interesting argument would be to require the training data to be shared, to prove the model was never exposed to the original source it's ripping off.
It might help set a precedent that would make this sort of thing less attractive.
I believe there have been lawsuits that have already proven these models stole copyrighted material and can reproduce it verbatim, yet there have been little to no real consequences for the AI companies. So if they can get away with that against companies that actually have the means to bring a strong lawsuit, the chances of an open source author successfully defending their code are slim (very slim, in my opinion).