Genuine question: how energy-intensive is it to run a model compared to training it? I always thought that once a model is trained, it's (comparatively) trivial to query.
For the small ones, a couple hundred watts on a GPU while generating. For the large ones, somewhere between 10 and 100 times that.
With specialty inference hardware, maybe 10x less.
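A quick back-of-envelope turns a power draw into energy per response. The 300 W draw and the 10-second generation time below are illustrative assumptions, not measurements:

```python
# Back-of-envelope energy per generated response, under assumed figures.
power_watts = 300    # assumed GPU draw while generating (illustrative)
gen_seconds = 10     # assumed time to generate one response (illustrative)

energy_joules = power_watts * gen_seconds   # 3000 J
energy_wh = energy_joules / 3600            # ~0.83 Wh per response

print(f"{energy_joules} J = {energy_wh:.2f} Wh per response")
```

Under those same assumptions, the 10-100x figure for the large models works out to very roughly 8-80 Wh per response.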
A lot of the smaller LLMs don't require a GPU at all - they run just fine on a normal consumer CPU.
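As a minimal sketch of CPU-only inference, assuming llama-cpp-python and a small quantized GGUF model already downloaded locally (the file path is a placeholder):

```python
# Minimal sketch: run a small quantized model entirely on the CPU
# with llama-cpp-python. The model path is a hypothetical placeholder.
from llama_cpp import Llama

llm = Llama(
    model_path="./model-q4_k_m.gguf",  # assumed local GGUF file
    n_gpu_layers=0,                    # 0 = keep every layer on the CPU
    n_ctx=2048,
)

out = llm("Explain in one sentence why small LLMs can run on CPUs:", max_tokens=64)
print(out["choices"][0]["text"])
```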
Wouldn't running on a CPU (while possible) make it less energy efficient, though?
It depends. A lot of LLM inference is memory-constrained rather than compute-constrained. If the model doesn't fit in VRAM and you're constantly shuffling weights in and out of GPU memory, the GPU can end up both slower and less energy-efficient than just running on the CPU.
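One rough way to see the memory bound: during decoding, each new token has to stream roughly the entire set of weights from memory, so tokens/second is capped at about memory bandwidth divided by model size. The numbers below are illustrative assumptions, not benchmarks; the GPU's advantage only holds while the model actually fits in its memory:

```python
# Rough illustration of the memory-bandwidth bound on LLM decoding:
# tokens/sec is capped at roughly (memory bandwidth) / (model size),
# because each token streams approximately all the weights once.
# All numbers are illustrative assumptions.
model_gb = 4.0  # e.g. a ~7B-parameter model quantized to ~4 bits

for name, bandwidth_gb_s in [("consumer CPU, dual-channel DDR5", 80),
                             ("consumer GPU, GDDR6X", 900)]:
    tokens_per_s = bandwidth_gb_s / model_gb
    print(f"{name}: ~{tokens_per_s:.0f} tokens/s upper bound")
```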