submitted 2 weeks ago by misk@sopuli.xyz to c/technology@lemmy.world
[-] pufferfischerpulver@feddit.org 9 points 2 weeks ago

Interesting that you focus on language, because that's exactly what LLMs cannot understand. No LLM actually has a concept of the meaning of words. Here's an excellent essay illustrating my point.

The fundamental problem is that deep learning ignores a core finding of cognitive science: sophisticated use of language relies upon world models and abstract representations. Systems like LLMs, which train on text-only data and use statistical learning to predict words, cannot understand language for two key reasons: first, even with vast scale, their training and data do not have the required information; and second, LLMs lack the world-modeling and symbolic reasoning systems that underpin the most important aspects of human language.

The data that LLMs rely upon has a fundamental problem: it is entirely linguistic. All LMs receive are streams of symbols detached from their referents, and all they can do is find predictive patterns in those streams. But critically, understanding language requires having a grasp of the situation in the external world, representing other agents with their emotions and motivations, and connecting all of these factors to syntactic structures and semantic terms. Since LLMs rely solely on text data that is not grounded in any external or extra-linguistic representation, the models are stuck within the system of language, and thus cannot understand it. This is the symbol grounding problem: with access to just a formal symbol system, one cannot figure out what these symbols are connected to outside the system (Harnad, 1990). Syntax alone is not enough to infer semantics. Training on just the form of language can allow LLMs to leverage artifacts in the data, but “cannot in principle lead to the learning of meaning” (Bender & Koller, 2020). Without any extralinguistic grounding, LLMs will inevitably misuse words, fail to pick up communicative intents, and misunderstand language.
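To make the "streams of symbols" point concrete, here's a minimal toy sketch (my own illustration, not from the essay; the corpus and names are invented) of purely form-based next-token prediction, the kind of statistical pattern-finding described above, scaled all the way down to bigram counts:

```python
from collections import Counter, defaultdict

# Invented toy corpus: the model below only ever sees the token stream itself.
corpus = "the cat sat on the mat the dog sat on the rug".split()

# Count bigram co-occurrences: an estimate of P(next token | current token)
# built purely from patterns in the stream, with no access to referents.
bigrams = defaultdict(Counter)
for cur, nxt in zip(corpus, corpus[1:]):
    bigrams[cur][nxt] += 1

def predict(token):
    """Return the most frequently observed next token, or None if unseen."""
    counts = bigrams.get(token)
    return counts.most_common(1)[0][0] if counts else None

print(predict("sat"))  # -> 'on', chosen by frequency alone

# Nothing in the training signal depends on what the symbols refer to:
# swapping every word for an opaque ID gives structurally identical statistics.
vocab = {w: f"sym{i}" for i, w in enumerate(dict.fromkeys(corpus))}
opaque_corpus = [vocab[w] for w in corpus]  # e.g. ['sym0', 'sym1', 'sym2', ...]
```

A real LLM replaces the bigram table with a neural network and billions of parameters, but the training signal is still only the token stream.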

[-] daniskarma@lemmy.dbzer0.com 0 points 2 weeks ago* (last edited 2 weeks ago)

But this "concepts" of things are built on the relation and iteration of this concepts with our brain.

A baby isn't born knowing that a table is a table. But they see a table, their parents say the word "table", and they end up imprinting that what they have to say when they see that thing is the word "table", which they can then relate to other things they know.

I've watched some kids grow up and learn to talk lately, and it's pretty evident how repetition precedes understanding. Many kids will just repeat words their parents said in a certain situation when they happen to find themselves in the same situation. It's pretty obvious with small kids, but it's a behaviour you can also see a lot in adults: just repeating something they heard once they see that those particular words fit the context.

Also, it's interesting that language can actually influence the way concepts are constructed in the brain. For instance, the ancient Greeks saw blue and green as the same colour, because they only had one word for both colours.
