how things become science
(lemmy.blahaj.zone)
There are huge public datasets that are often used for pretraining. Common Crawl and C4 are probably the most prominent, but there are others.
There are also big public datasets available for fine-tuning and instruction tuning.
The open-weight models are getting pretty powerful, thanks to some Chinese labs.