Save The Planet (lazysoci.al)
[-] brucethemoose@lemmy.world 0 points 1 month ago* (last edited 1 month ago)

Only because of brute force over efficient approaches.

Again, look up DeepSeek's FP8/multi-GPU training paper, and some of the code they published. They used a microscopic fraction of the compute that OpenAI or xAI are using.

And models like SDXL or Flux are not that expensive to train.

It doesn’t have to be this way, but they can get away with it because being rich covers up internal dysfunction/isolation/whatever. Chinese trainers, and other GPU-constrained ones, are forced to be thrifty.
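To put a rough number on the FP8 point: storing weights in 8-bit floats takes a quarter of the memory of 32-bit floats, which is a big part of why FP8 training fits on far fewer GPUs. A back-of-the-envelope sketch (the 671B parameter count for DeepSeek-V3 is from public reports; the rest is illustrative arithmetic, not a full training-memory model):

```python
# Rough memory footprint of model weights at different precisions.
# This ignores optimizer state, activations, and gradients, which
# dominate in real training runs -- it only shows the weight storage ratio.

def weight_memory_gb(n_params: float, bytes_per_param: float) -> float:
    """Memory needed just to hold the weights, in gigabytes."""
    return n_params * bytes_per_param / 1e9

N = 671e9  # DeepSeek-V3 total parameter count (publicly reported figure)

fp32 = weight_memory_gb(N, 4)  # 32-bit float: 4 bytes per parameter
fp8 = weight_memory_gb(N, 1)   # FP8 (e.g. E4M3): 1 byte per parameter

print(f"FP32 weights: {fp32:.0f} GB")
print(f"FP8 weights:  {fp8:.0f} GB")  # 4x smaller before any other savings
```

The same 4x ratio applies to memory bandwidth during the matrix multiplies, which is where much of the claimed training-cost reduction comes from.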

[-] HK65@sopuli.xyz 1 point 1 month ago

And I guess they need it to be inefficient and expensive so that it remains exclusive to them. That's why they were throwing a tantrum over DeepSeek: because they proved it doesn't have to be.

this post was submitted on 01 Jul 2025
229 points (98.7% liked)

Microblog Memes


A place to share screenshots of Microblog posts, whether from Mastodon, tumblr, ~~Twitter~~ X, KBin, Threads or elsewhere.

Created as an evolution of White People Twitter and other tweet-capture subreddits.

Rules:

  1. Please put at least one word relevant to the post in the post title.
  2. Be nice.
  3. No advertising, brand promotion or guerrilla marketing.
  4. Posters are encouraged to link to the toot or tweet etc in the description of posts.
