1-bit LLMs Could Solve AI’s Energy Demands
(spectrum.ieee.org)
Isn't this true of standard multi-bit neural networks too? This seems to be what a nonlinear activation function achieves: translating the input values into an all-or-nothing activation.
The defining characteristic of a 1-bit model is not that its activations are recorded in a single bit but that its weights are. There are no gradations of connection weights: they are just on or off. As far as I know, that's different both from standard neural nets and from how the brain works.
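To make the weights-versus-activations distinction concrete, here is a rough sketch in plain Python. It is not taken from the article: the function names, the example numbers, and the choice to encode the two weight states as +1/-1 by sign are all assumptions made for illustration. The point is only that the weights collapse to two values while the activations coming out of the layer stay ordinary real numbers.

```python
def binarize_weights(weights):
    """Collapse each real-valued weight to one of two states.

    Assumption: the two states are encoded as +1 and -1, chosen by sign.
    """
    return [[1 if w >= 0 else -1 for w in row] for row in weights]


def relu(x):
    """A standard nonlinear activation: negative inputs become 0."""
    return max(0.0, x)


def layer(weights, inputs):
    """One dense layer: weighted sum per output unit, then ReLU."""
    return [relu(sum(w * x for w, x in zip(row, inputs))) for row in weights]


# Hypothetical full-precision weights for a layer with 3 inputs and 2 outputs.
full_precision = [[0.8, -0.1, 0.3],
                  [-0.5, 0.9, -0.2]]
one_bit = binarize_weights(full_precision)

inputs = [0.5, 1.0, -2.0]
full_out = layer(full_precision, inputs)  # graded weights
bit_out = layer(one_bit, inputs)          # two-state weights, graded activations

print(one_bit)
print(full_out)
print(bit_out)
```

In both versions the activation function is the same, so the nonlinearity is not what makes a model "1-bit"; only the stored weights change.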