It turns out you can train AI models without copyrighted material
(www.engadget.com)
My question is, is this dataset also Free Range or Cage Free?
Cage-free, as it hasn't been around long enough to be in publicly owned data.
Is 8 TB even shit for data? I thought these things needed to feed on hundreds of terabytes of data.
It's a bit weird to refer to it in terabytes. Reading the paper, their biggest model was trained on 2 trillion tokens. Qwen 3 was pre-trained on 36 trillion tokens, with post-training on top of that. It's kinda fine for what it is, but the smaller training set absolutely contributes to its poor performance.
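For anyone who wants to sanity-check the terabytes-versus-tokens comparison, here's a rough back-of-envelope sketch. The 4 bytes per token figure is my own assumption (a common rule of thumb for English text), not something from the paper, and `tokens_from_bytes` is just a made-up helper name. The 2 trillion and 36 trillion figures are the ones quoted above.

```python
# Back-of-envelope only. BYTES_PER_TOKEN is an assumed rule of thumb,
# not a number taken from the paper or from any tokenizer's documentation.
BYTES_PER_TOKEN = 4.0


def tokens_from_bytes(n_bytes: float, bytes_per_token: float = BYTES_PER_TOKEN) -> float:
    """Estimate a token count from raw text size (hypothetical helper)."""
    return n_bytes / bytes_per_token


dataset_bytes = 8e12      # 8 TB, using decimal terabytes
trained_tokens = 2e12     # 2 trillion tokens, as quoted from the paper
qwen3_tokens = 36e12      # 36 trillion pre-training tokens, as quoted for Qwen 3

estimated_tokens = tokens_from_bytes(dataset_bytes)
ratio = qwen3_tokens / trained_tokens

print(f"Estimated tokens in 8 TB: {estimated_tokens:.2e}")
print(f"Qwen 3 pre-training tokens per token used here: {ratio:.0f}x")
```

Under that assumption, 8 TB and 2 trillion tokens land in the same ballpark, and the gap to Qwen 3's pre-training budget is a factor of 18. Real bytes-per-token ratios vary with the tokenizer and the mix of languages and code, so treat this as an order-of-magnitude check only.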