40
An Analysis of DeepMind's 'Language Modeling Is Compression' Paper
(codeconfessions.substack.com)
This is a most excellent place for technology news and articles.
No, because our brains also use hierarchical activation for association, which is why if we're talking about bugs and I say "I got a B" you assume its a stinging insect, not a passing grade.
If it was simple word2vec we wouldn't have that additional means of noise suppression.