David Sacks claims there's 'substantial evidence' that DeepSeek used OpenAI's models to train its own | TechCrunch (techcrunch.com)

submitted 10 months ago by Zerush@lemmy.ml to c/technology@lemmy.ml

61 comments

LOL

[-] Lugh@futurology.today 154 points 10 months ago

So the same people who have no problem about using other people's copyrighted work, are now crying when the Chinese do the same to them? Find me a nano-scale violin so I can play a really sad song.

[-] iamericandre@lemmy.world 55 points 10 months ago

[-] wewbull@feddit.uk 25 points 10 months ago

That's obviously a cello.

[-] bradorsomething@ttrpg.network 2 points 10 months ago* (last edited 10 months ago)

Maybe a cello if it was human grade. But that’s tardigrade.

[-] Viri4thus@feddit.org 7 points 10 months ago

Pedant time: That's microscale not nanoscale.

You can shoot me now, it's deserved.

[-] sp3tr4l@lemmy.zip 5 points 10 months ago

Can you put a liuqin in there?

[-] j4k3@lemmy.world 23 points 10 months ago* (last edited 10 months ago)

Planck could not scale small enough.

[-] harsh3466@lemmy.ml 2 points 10 months ago

You more elegantly said what I came to say.

[-] Nadru@lemmy.world 99 points 10 months ago

Stealing from thieves is not theft

[-] admin@lemmy.my-box.dev 27 points 10 months ago* (last edited 10 months ago)

Yes it is. Although I personally have far less moral objections to it.

To elaborate:
OpenAI scraped data without permission, and then makes money from it.

Deepseek then used that data (even paid openai for it), trained a model on that data, and then releases that model for anyone to use.

While it's still making use of "stolen data" (that's a whole semantics discussion I won't get into right now), I find it far more noble than the former.

[-] harsh3466@lemmy.ml 11 points 10 months ago

Came to say something similar. Like I give a fuck that OpenAI's model/tech/whatever was "stolen" by Deepseek. Fuck that piece of shit Sam Altman.

[-] wewbull@feddit.uk 2 points 10 months ago

"Receiving stolen goods" is prosecutable.

It's a lesser crime than the original theft though.

[-] rtxn@lemmy.world 47 points 10 months ago

Cry me a fucking river, David.

[-] Acoustic@lemm.ee 44 points 10 months ago

Bruh, these guys trained their own AI on so-called "publicly available" content. Except it was, and still is, completely without consent from, or compensation to, said artists/bloggers/creators etc. Don't throw rocks when you live in a glass house 🤌

[-] notannpc@lemmy.world 43 points 10 months ago

Oh are we supposed to care about substantial evidence of theft now? Because there’s a few artists, writers, and other creatives that would like to have a word with you…

[-] extremeboredom@lemmy.world 41 points 10 months ago

Womp womp. I'm sure openAI asked for permission from the creators for all its training data, right? Thief complains about someone else stealing their stolen goods, more at 11.

[-] j4k3@lemmy.world 37 points 10 months ago

OpenAI's mission statement is also in their name. The fact that they have a proprietary product that is not open source is criminal, and they should be sued out of existence. They are now just like Sun Micro after Apache was open sourced: irrelevant, they just haven't gotten the memo yet. No company can compete against the whole world.

[-] halcyoncmdr@lemmy.world 35 points 10 months ago

Your point OpenAI? Weren't you part of the group saying training AI wasn't copyright infringement? Not so happy when it's your shit being copied? Huh. Weird.

[-] qarbone@lemmy.world 4 points 10 months ago* (last edited 10 months ago)

The only concern is how much the cost of training the model changes if it got a significant kickstart from previous, very-expensive training. I was interested because it was said to be comparable for a fraction of the cost. "Open"AI can suck sand.

[-] vfreire85@lemmy.ml 30 points 10 months ago

so he's just admitting that deepseek did a better job than openai but for a fraction of the price? it only gets better.

[-] dawnglider@lemmy.ml 18 points 10 months ago* (last edited 10 months ago)

It's funny that they did all that and open-sourced it too. Like some kid accusing another of copying their homework while the other kid did significantly better and also offered to share.

[-] conicalscientist@lemmy.world 29 points 10 months ago

When you can't win, accuse them of cheating.

[-] barnaclebutt@lemmy.world 13 points 10 months ago

But, but they committed the copyright infringement first. It's theirs. That's totally unfair. What are tech bros going to do? Admit they are grossly overvalued? They've already spent the billions.

[-] mctoasterson@reddthat.com 27 points 10 months ago

Here's the thing... It was a bubble because you can't wall off the entire concept of AI. This revelation was just an acceleration displaying what should've been obvious.

There are many, many open models available for people to fuck around with. I have in a homelab setting, just to keep abreast of what is going on, get a general idea how it works and what it's capable of.

What most normie followers of AI don't seem to understand is, whether you're doing LLM or machine learning object detection or something, you can get open software that is "good enough" and run it locally. If you have a raspberry pi you can run some of this stuff, and it will be slow, but acceptable for many use cases.

So the concept that only OpenAI would ever hold the keys and should therefore have massive valuation in perpetuity, that is just laughable. This Chinese company just highlighted that you can bruteforce train more optimized models on garbage-tier hardware.

[-] Zerush@lemmy.ml 1 points 10 months ago* (last edited 10 months ago)

Yes, AI has arrived to stay, that is a fact. Also, soon there will be a unified global AI (see the Stargate Project) which will make all other LLMs obsolete. A $500 billion project. The only thing we can do is use it in the most intelligent way; avoiding it will be impossible.

[-] lobut@lemmy.ca 27 points 10 months ago* (last edited 10 months ago)

Me pretending to care about David Sacks' claim:

Open AI CTO making a stupid face after being asked if they steal

[-] thefluffiest@feddit.nl 22 points 10 months ago

FUD, just to distract from the crushing multibillion dollar defeat they’ve just been dealt. First stage of grief: denial. Second: anger. Third: bargaining. We’re somewhere between 2 and 3 right now.

[-] KeenFlame@feddit.nu 2 points 10 months ago

Nope, it's definitely true, but sensationalism. Almost all models are trained using gpt

[-] Embargo@lemm.ee 22 points 10 months ago

I couldn't give less of a fuck.

[-] ToadOfHypnosis@lemmy.ml 18 points 10 months ago

Open AI stole all of our data to train their model. If this is true, no sympathy.

[-] Zerush@lemmy.ml 2 points 10 months ago

That is what I mean: there's a difference between an AI with robbed content in its knowledge/language base and an AI assistant which only searches for information on the web to answer, linking to the corresponding pages. A way more intelligent and ethical use of an AI.

[-] dumbass@leminal.space 17 points 10 months ago

I think chatgpt is more self aware than OpenAI is.

[-] Etterra@discuss.online 17 points 10 months ago

Yeah, and? WTF are you chuckle fucks gonna do about it? Whine and complain without a hint of irony? Because that's all this is. Because there's not a goddamn thing you can do about it, and you hate that. Not to mention the hit to your nearly bottomless wallets. Cry more emo kid, your suffering sustains me.

[-] TomMasz@lemmy.world 16 points 10 months ago

Copycat gets copycatted.

[-] absquatulate@lemmy.world 13 points 10 months ago

It's only ok when we do it, cause we're the good guys!

[-] some_guy@lemmy.sdf.org 13 points 10 months ago

Yes, so what?

https://stratechery.com/2025/deepseek-faq/

Who the fuck cares? They're all doing this.

[-] horse_battery_staple@lemmy.world 10 points 10 months ago

They're obviously trying so hard for regulatory capture in the states it's embarrassing.

[-] yogthos@lemmy.ml 10 points 10 months ago

If there's one thing we know about American AI companies it's that they have a spotless record when it comes to data ethics. Never touched unauthorized data. Swear! Not even once. Of course not.

[-] veroxii@aussie.zone 8 points 10 months ago

Well you can't run openai's models yourself so pretty sure deepseek would've had to pay for API access. How is that stealing again?

[-] OhStopYellingAtMe@lemmy.world 7 points 10 months ago

They’re eating each other.

[-] P00ptart@lemmy.world 3 points 10 months ago

Exactly. This is the end. The companies eat each other while we suffer.

[-] KeenFlame@feddit.nu 4 points 10 months ago

So what? It's absolutely true and makes absolutely no difference to anyone

[-] zaft@lemmy.world 4 points 10 months ago* (last edited 10 months ago)

Whoop de doo

[-] mindbleach@sh.itjust.works 4 points 10 months ago

Training is transformative use. Same reason I don't care which DVDs they show the draw-anything robot.

If I somehow stole ChatGPT's weights and pruned them to one-tenth their size, that'd be on-par with leaking the source code to a game. Any support would be yo-ho-ho vigilante justice kinds of support.

But I just point my chatbot at your chatbot, and mine winds up better and smaller... tough shit.

[-] dx1@lemmy.ml 2 points 10 months ago

Good luck suing them

[-] Zerush@lemmy.ml 2 points 10 months ago

I still prefer Andisearch over all others. It was the first AI search, long before all the others were released, with its own LLM, not a copy from others.

this post was submitted on 29 Jan 2025

126 points (92.6% liked)
