
OpenAI be like (lemmy.zip)

submitted 7 months ago by cm0002@lemmy.zip to c/memes@sopuli.xyz


[-] its_kim_love@lemmy.blahaj.zone 53 points 7 months ago

Rules for thee.

[-] AceFuzzLord@lemmy.zip 38 points 7 months ago

Both companies deserve ruin because all genAI deserves to be wiped off the face of the planet.

[-] mindbleach@sh.itjust.works -2 points 7 months ago

Oh no, statistical modeling about published works allows weird new shit. We must ban this entire class of software because we all care so deeply about copyright.

[-] mindbleach@sh.itjust.works 32 points 7 months ago

Apparently Hunyuan just released some big-ass video model, and it's air-quotes "open source" with a bunch of finger-wag restrictions. One of them is 'you may not train your thing on our thing.'

Yeah I'm sure the companies that shrug off copyright concerns for Disney movies give a shit about Tencent's pre-laundered intellectual property.

[-] dharmacurious@slrpnk.net 28 points 7 months ago

Okay, help me out here. I've heard people talking about open source ai models, and it always seems like open source needs big ass air quotes. Are there any open source models that are actually open source in the way people generally think of the term?

[-] dovah@lemmy.world 35 points 7 months ago

Here's a list of open source models: open-llms

Models are only open source if the weights are freely available along with the code used to generate them.

[-] morrowind@lemmy.ml 8 points 7 months ago

I would argue to be truly open source the training data needs to be as well.

[-] dharmacurious@slrpnk.net 7 points 7 months ago

I really appreciate that! I was asking more for the information of it, I doubt I could do anything with the link. Lol. I don't understand thing 1 about this stuff. I don't even know wtf a weight is in this context lol

[-] edinbruh@feddit.it 7 points 7 months ago* (last edited 7 months ago)

In this context, "weight" is a mathematical term. Have you ever heard the term "weighted average"? Basically, it means calculating an average where some elements are more influential (important) than others. The number that indicates the importance of an element is called a weight.
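To make the weighted average concrete, here is a minimal Python sketch. The grades and importance values are made up for illustration; nothing here comes from a real model.

```python
def weighted_average(values, weights):
    """Average where each value counts in proportion to its weight."""
    total_weight = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical example: two homework grades and one exam grade,
# where the exam counts three times as much as each homework.
grades = [80, 90, 60]
importance = [1, 1, 3]

result = weighted_average(grades, importance)
print(result)  # 70.0, pulled toward the exam grade; the plain average would be about 76.7
```

Changing the weights changes the result even though the grades stay the same, which is the property a neural network relies on.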

One oversimplification of how any neural network works could be this:

The NN receives some values as input.
The NN calculates many weighted averages from those values. Each average uses a different list of weights.
The NN does a simple special operation on each average. It's not important what the operation actually is, but it must be there: without it, any stack of layers would collapse into the equivalent of a single layer. It can be almost anything that isn't just sums and multiplications by constants (in other words, it has to be non-linear).
The modified averages are the input values for the next layer.
Each layer has different lists of weights.
In reality this is all done using some mathematical and computational tricks, but the basic idea is the same.
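The steps above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not how any real model is implemented: the weights are made up, the "special operation" is assumed to be ReLU (one common choice), and plain weighted sums are used instead of normalized averages, which is closer to what real networks do. Real networks also usually add a bias term, omitted here.

```python
def relu(x):
    # The "simple special operation": a non-linear function.
    # ReLU is an assumption here; many other choices work.
    return x if x > 0 else 0.0


def layer(inputs, weight_lists):
    # One weighted sum per list of weights, each passed through relu.
    return [
        relu(sum(i * w for i, w in zip(inputs, weights)))
        for weights in weight_lists
    ]


def forward(inputs, layers):
    # The outputs of each layer become the inputs of the next one.
    values = inputs
    for weight_lists in layers:
        values = layer(values, weight_lists)
    return values


# Hypothetical two-layer network with made-up weights.
network = [
    [[0.5, -1.0], [1.0, 1.0]],  # layer 1: two weight lists, so two outputs
    [[1.0, 2.0]],               # layer 2: one weight list, so one output
]

output = forward([2.0, 1.0], network)
print(output)  # [6.0]
```

The `network` list is exactly what gets published when a model's weights are released, only with billions of numbers instead of six.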

Training an AI means finding the weights that give the best results, and thus, for an AI to be open-source, we need both the weights and the training code that generated them.
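As a rough illustration of what "finding the weights" means, here is a toy training loop for a model with a single weight. It assumes gradient descent on a mean squared error, which is one standard approach; the data, step count, and learning rate are invented for the example.

```python
def train(samples, steps=200, learning_rate=0.01):
    """Find the weight w that makes w * x as close as possible to y."""
    weight = 0.0  # start from an arbitrary guess
    for _ in range(steps):
        # Gradient of the mean squared error with respect to the weight.
        gradient = sum(2 * (weight * x - y) * x for x, y in samples) / len(samples)
        # Nudge the weight in the direction that reduces the error.
        weight -= learning_rate * gradient
    return weight


# Hypothetical training data that follows y = 3 * x.
samples = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]

learned = train(samples)
print(round(learned, 3))  # 3.0
```

Here `train` plays the role of the training code, `samples` the training data, and `learned` the weights, which is why all three come up when people argue about what "open source" should cover.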

Personally, I feel that we should also have the original training data itself to call it open source, not just weights and code.

[-] MrMcGasion@lemmy.world 4 points 7 months ago

Absolutely agree that to be called open source the training data should also be open. It would also pretty much mean that true open source models would be ethically trained.

[-] Vex_Detrause@lemmy.ca 1 points 7 months ago

What does the open source training data include? I've read about a few open source training datasets that are also tested for bias, but I haven't really looked at them.

[-] dharmacurious@slrpnk.net 1 points 7 months ago

Thank you!

And yeah, it really does seem like the training data should be open. Like, not even just to be considered open source, just to be allowed to do this at all, ethically, the training data should be known, at least to some degree. Like, there's so much shit out there, knowing what they trained on would help make some kind of ethical choice in using it

[-] dovah@lemmy.world 1 points 7 months ago

Yeah, good call. Training data should be available as well.

[-] veroxii@aussie.zone 3 points 7 months ago

And as I understand it these Chinese "open source" models are only the weights? No way to "compile" your own version.

[-] dovah@lemmy.world 1 points 7 months ago

I'm not sure what you mean about Chinese models, but you can find the code used for training. Open Llama, for example, gives you the weights, the data, and the code used for training. You can do everything yourself, if you wanted to. The hardest part is getting the appropriate hardware.

[-] Radiant_sir_radiant@beehaw.org 4 points 7 months ago* (last edited 7 months ago)

The closest one to true FOSS that I'm aware of is Apertus. Not sure whether it's feasible to build anything meaningful from scratch without your own GPU farm though.

[-] mindbleach@sh.itjust.works 2 points 7 months ago

Do such models exist? Yes. Are they the big-boy models anyone's really using? Ehhh not really.

There are in-use models that are "here's a thing do whatever good luck," which is at least as open-source as any MIT project. (Permissive licenses being "here is the code, have a nice life.") Very few models are properly reproducible, because even when their training data includes DVDs you probably own, it also includes a ton of random internet pages that maybe don't exist anymore. The push for ever-larger models, trained on as much stuff as possible, makes the use of "open source" a regrettable or even deceptive choice. But quite a few are unrestricted for whatever weird shit you want to get up to.

[-] Jankatarch@lemmy.world 2 points 7 months ago

I mean, you could give the randomizer seed along with the code for training. I guess that would kinda count?
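For what the seed buys you, here is a small Python sketch. The `initial_weights` helper is hypothetical; it only shows that the same seed reproduces the same random starting weights. In real training the seed does not stand in for the training data, and other sources of nondeterminism (such as some GPU operations) can still make runs differ.

```python
import random


def initial_weights(seed, count):
    """Hypothetical helper: draw starting weights from a seeded generator."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(count)]


first = initial_weights(seed=42, count=4)
second = initial_weights(seed=42, count=4)
other = initial_weights(seed=7, count=4)

print(first == second)  # True: same seed, same starting weights
print(first == other)   # False: a different seed gives different weights
```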

this post was submitted on 14 Oct 2025

784 points (99.2% liked)
