Genuine question: how energy-intensive is it to run a model compared to training it? I always thought that once a model is trained, it's (comparatively) trivial to query.
For the small ones, a couple hundred watts on a GPU while generating. For the large ones, somewhere between 10 and 100 times that.
With specialty inference hardware, maybe 10x less.
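A quick back-of-envelope turns a power draw into energy per response. The 300 W draw and the 10-second generation time below are illustrative assumptions, not measurements:

```python
# Back-of-envelope energy per generated response, under assumed figures.
power_watts = 300    # assumed GPU draw while generating (illustrative)
gen_seconds = 10     # assumed time to generate one response (illustrative)

energy_joules = power_watts * gen_seconds   # 3000 J
energy_wh = energy_joules / 3600            # ~0.83 Wh per response

print(f"{energy_joules} J = {energy_wh:.2f} Wh per response")
```

Under those same assumptions, the 10-100x figure for the large models works out to very roughly 8-80 Wh per response.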
A lot of the smaller LLMs don't require a GPU at all - they run just fine on a normal consumer CPU.
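As a minimal sketch of CPU-only inference, assuming llama-cpp-python and a small quantized GGUF model already downloaded locally (the file path is a placeholder):

```python
# Minimal sketch: run a small quantized model entirely on the CPU
# with llama-cpp-python. The model path is a hypothetical placeholder.
from llama_cpp import Llama

llm = Llama(
    model_path="./model-q4_k_m.gguf",  # assumed local GGUF file
    n_gpu_layers=0,                    # 0 = keep every layer on the CPU
    n_ctx=2048,
)

out = llm("Explain in one sentence why small LLMs can run on CPUs:", max_tokens=64)
print(out["choices"][0]["text"])
```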
Wouldn't running on a CPU (while possible) make it less energy efficient, though?
It depends. A lot of LLM inference is memory-constrained rather than compute-constrained. If the model doesn't fit in VRAM and you're constantly shuffling weights in and out of GPU memory, the GPU can end up both slower and less energy-efficient than just running on the CPU.
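One rough way to see the memory bound: during decoding, each new token has to stream roughly the entire set of weights from memory, so tokens/second is capped at about memory bandwidth divided by model size. The numbers below are illustrative assumptions, not benchmarks; the GPU's advantage only holds while the model actually fits in its memory:

```python
# Rough illustration of the memory-bandwidth bound on LLM decoding:
# tokens/sec is capped at roughly (memory bandwidth) / (model size),
# because each token streams approximately all the weights once.
# All numbers are illustrative assumptions.
model_gb = 4.0  # e.g. a ~7B-parameter model quantized to ~4 bits

for name, bandwidth_gb_s in [("consumer CPU, dual-channel DDR5", 80),
                             ("consumer GPU, GDDR6X", 900)]:
    tokens_per_s = bandwidth_gb_s / model_gb
    print(f"{name}: ~{tokens_per_s:.0f} tokens/s upper bound")
```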