They managed a substantial incremental improvement over previous models by first creating a better set of data as their starting point.
https://huggingface.co/apple/DCLM-7B
Apple AI Released a 7B Open-Source Language Model Trained on 2.5T Tokens on Open Datasets.
(www.marktechpost.com)
Is this the 'research only' one that was trained on YouTube transcripts, including mkbhd's?
As someone who knows nothing about this stuff, yes.
Happy cake day, Wang suck dude
As someone who knows f-all about AI, I support your endorsement, if only because I'm impressed the training took 2.5 trillion tokes, so you know that AI smokes like a banger!
Hopefully they'll be able to put something together that can run locally, so they can finally stop using this "Open"AI bullshit
this post was submitted on 21 Jul 2024
152 points (96.9% liked)