Speech may have a universal transmission rate: 39 bits per second (www.science.org)

submitted 1 day ago by rssbot@lemmy.bestiver.se to c/hackernews@lemmy.bestiver.se


[-] Windex007@lemmy.world 7 points 1 day ago* (last edited 1 day ago)

The methodology sounds bizarrely complex to me for the purposes of establishing comparative information transfer rate.

Wouldn't just timing how long it takes to communicate a controlled set of information answer that?

I'm confused by the concept of establishing an average "bitrate per syllable" and multiplying that through. Is this trying to address cases where language constructs DEMAND additional information be encoded in speech? Can one not construct a set of information intended to be communicated that could account for those quirks? Find some "lowest common denominator" sentences?

I feel like I'm missing something and I'm very curious about what my faulty assumption is

[-] lvxferre@mander.xyz 3 points 1 day ago

Can one not construct a set of information intended to be communicated that could account for those quirks? Find some “lowest common denominator” sentences?

I think this would require deeper knowledge of all 17 languages in question, and be a potential source of errors - for example, if you include some info in the set that is easier/harder to convey succinctly in one language than in the other languages.

In the meantime, it's easy to get good averages for bits/syllable and syllables/second, even if you don't know the languages in question.
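
A minimal sketch of that calculation, assuming a plain unigram Shannon entropy over syllables (a simplification, and not necessarily the estimator the paper uses). The syllable list and the 2-second duration are made-up toy values, and the function names are hypothetical:

```python
import math
from collections import Counter


def bits_per_syllable(syllables):
    """Unigram Shannon entropy of a syllable sequence, in bits per syllable."""
    counts = Counter(syllables)
    total = len(syllables)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def bits_per_second(syllables, duration_seconds):
    """Information rate = (bits per syllable) * (syllables per second)."""
    syllables_per_second = len(syllables) / duration_seconds
    return bits_per_syllable(syllables) * syllables_per_second


# Hypothetical toy "recording": 8 syllables spoken in 2 seconds.
toy = ["ka", "ta", "ka", "mi", "ta", "ka", "mi", "ka"]
density = bits_per_syllable(toy)                  # information per syllable
rate = bits_per_second(toy, duration_seconds=2.0)  # information per second
print(f"{density:.2f} bits/syllable, {rate:.2f} bits/s")
```

Neither step needs you to understand the language: you only count syllables and time the recording.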

[-] Windex007@lemmy.world 2 points 14 hours ago

I agree there would be challenges around information selection. I expect Runasimi can speak more "quickly" or "efficiently" than Inuktitut about labour-based taxation in the form of terraced plateaus growing cocoa, but the contrast would run in the opposite direction when describing the ice floe route, and the associated ice quality, that a polar bear took while hunting a seal.

Also, just because a syllable "encodes more bits on average" does it imply faster transmission rate? Just because French encodes gender information into its language and syllables, isn't knowing the gender of a shovel at best "check bits"? Used for detecting transmission errors but not intrinsically critical data?

I'm not a linguist. I'm barely a scientist. I'm fascinated by the assertion that it's easy to establish "bits per second" on syllables having somehow abstracted away social context. I'm not saying you're wrong or they're wrong, just that this rubs my naive intuitions exactly the wrong way... Which speaks more to the quality of my intuitions (apparently quite bad) than to the real science done by people actually in the field.

[-] lvxferre@mander.xyz 2 points 12 hours ago

Also, just because a syllable “encodes more bits on average” does it imply faster transmission rate?

If by "faster" you're measuring:

the transmission per syllable - then yes
the transmission per second - then no

This is easier to see in the original paper than in the OP. Check page 3; the second column is the rate of transmission per second, it's roughly 35~45 bits/s for all of them.
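
To make the per-syllable vs. per-second distinction concrete, here is a small sketch with two invented languages. The figures are hypothetical, picked only so the trade-off is visible; they are not values from the paper:

```python
# Hypothetical numbers chosen for illustration; NOT taken from the paper.
languages = {
    "dense_slow": {"bits_per_syllable": 8.0, "syllables_per_second": 5.0},
    "light_fast": {"bits_per_syllable": 5.0, "syllables_per_second": 8.0},
}


def information_rate(lang):
    """Bits per second = bits per syllable * syllables per second."""
    return lang["bits_per_syllable"] * lang["syllables_per_second"]


rates = {name: information_rate(values) for name, values in languages.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.1f} bits/s")
```

The "dense" language wins per syllable, but its speakers produce fewer syllables per second, so both end up at the same rate per second.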

Just because French encodes gender information into its language and syllables, isn’t knowing the gender of a shovel at best “check bits”? Used for detecting transmission errors but not intrinsically critical data?

At least in theory, redundancy required by [gender, number, case, etc.] agreement shouldn't count, as it isn't adding new information - it's only repeating info already provided. In practice it's hard to model this, so the numbers for gendered languages might be a bit overestimated.
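
A rough way to see why agreement adds no information, assuming every noun has one fixed gender: the entropy of (noun, gender) pairs equals the entropy of the nouns alone. The mini-lexicon and word sequence below are an illustrative toy sample, not real corpus data:

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy, in bits, of the empirical distribution of `items`."""
    counts = Counter(items)
    total = len(items)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Toy lexicon (illustrative): each noun always carries the same gender.
GENDER = {"pelle": "F", "marteau": "M", "table": "F", "livre": "M"}
nouns = ["pelle", "marteau", "table", "livre", "pelle", "table", "livre", "pelle"]

noun_entropy = entropy(nouns)
joint_entropy = entropy([(noun, GENDER[noun]) for noun in nouns])

# Gender is fully determined by the noun, so it contributes no extra bits.
extra_bits = joint_entropy - noun_entropy
print(f"extra bits from gender marking: {extra_bits:.3f}")
```

That is the "check bits" intuition: the marking repeats what the noun already told you, which helps error detection but not throughput.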

Note however gender has a second role, besides agreement: derivation. Derivation should actually increase bits/second, since it lets you convey some things succinctly.

I’m fascinated by the assertion that it’s easy to establish “bits per second” on syllables having somehow abstracted away social context.

The social context (and context as a whole) plays a huge role in that, as paralinguistic information. However, the scope of the study is only the linguistic info, encoded by the language itself.

[-] Windex007@lemmy.world 2 points 10 hours ago* (last edited 10 hours ago)

Note however gender has a second role, besides agreement: derivation

Interesting... I hadn't considered that this might enable linguistic "shorthands", is that the implication?

Sounds to me on the whole like you're saying that the bitrate per syllable is solid and doing the heavy lifting here?

It's super interesting; and the implications are actually huge.

I'd be interested in follow up studies to examine emergent linguistic patterns. Can we weigh syllabic encoding by common usage by age? If we eliminate "thouest" from the dictionary but include "skibidi" how does that skew patterns for informational density?

Science is so fucking cool and I'm stoked that people nerd out on shit that I'm an idiot about so I can learn about the nature of the world.

[-] lvxferre@mander.xyz 1 points 1 hour ago* (last edited 1 hour ago)

Interesting… I hadn’t considered that this might enable linguistic “shorthands”, is that the implication?

Yes, it is! Agreement on its own already allows some "shorthands", if you're able to omit the nouns; but derivation in particular allows a lot of them, because it lets you cram more info into the word at the "cost" of 1~3 phonemes.

I'll give you some examples of that, using Portuguese for my own convenience; do note however you'll see similar stuff popping up in other gendered languages.

First example:

1a. O relógio (M) caiu sobre a mesa (F), e ele (M) quebrou.
1b. O relógio (M) caiu sobre a mesa (F), e ela (F) quebrou.

Both sentences mean "the clock fell onto the table, and it broke", but the "ele" (he/it) in 1a refers to the clock, and the "ela" (she/it) in 1b to the table. By changing the gender of the pronoun, you can force it to refer to one noun or the other, in a rather succinct way that isn't available in a non-gendered language like English. (I feel like "it" would refer to the clock, as the agent of the first clause, and if you wanted to refer to the table breaking you'd need to repeat the noun.)

Of course, this "shorthand" only works if both nouns happen to have different genders, but it's already enough to cram a bit more info per syllable. In other cases people use the same strategies as in English.
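
As a back-of-the-envelope sketch of how often that works, assume a two-gender language and treat the two nouns as independent draws (a strong simplification; the shares below are hypothetical, not measured for Portuguese):

```python
def chance_genders_differ(share_masculine):
    """Probability that two independently drawn nouns differ in gender,
    in a two-gender language where `share_masculine` of nouns are masculine."""
    p = share_masculine
    return 2 * p * (1 - p)


# Hypothetical shares of masculine nouns.
even_split = chance_genders_differ(0.5)  # genders evenly split
skewed = chance_genders_differ(0.9)      # one gender dominates
print(f"even split: {even_split:.2f}, skewed: {skewed:.2f}")
```

Under these assumptions the pronoun trick is available at most half the time, and less often the more lopsided the gender split is.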

Second example:

2. Pedro tem dois gatos: uma (F) frajola (F or M) e um (M) malhado (M).

Translated directly, this sentence becomes "Peter has two cats: a tuxedo and a tabby". However, the translation doesn't mention that the tuxedo is female and the tabby male. In a non-gendered language you'd need to either ditch those pieces of info or state them explicitly, and that takes more words.

I’d be interested in follow up studies to examine emergent linguistic patterns. Can we weigh syllabic encoding by common usage by age? If we eliminate “thouest” from the dictionary but include “skibidi” how does that skew patterns for informational density?

The impact of an individual word would be fairly minimal, I think. However, if you're systematically changing sounds or the grammar, like languages often do (cf. "want to", "going to", "trying to" → "wanna", "gonna", "tryna"), the impact will be fairly high. And it would likely be compensated elsewhere, to keep the bits/second rate roughly the same.

Science is so fucking cool and I’m stoked that people nerd out on shit that I’m an idiot about so I can learn about the nature of the world.

And the fun part is that everybody is an idiot about most topics, except for a few individual areas of expertise. We're basically a race of clueless apes trying to make sense of the world.

[-] Zier@fedia.io 2 points 1 day ago

Can you hear me now?

[-] schnurrito@discuss.tchncs.de 2 points 1 day ago

The article is from 2019. I was wondering, because my impression was that this was pretty old news and common knowledge by now.

this post was submitted on 04 Aug 2025

34 points (100.0% liked)
