tech (feddit.org)
[-] Tja@programming.dev 0 points 2 days ago

I disagree for various reasons:

It's not very uncommon, and it would be an issue as soon as it happens, without going back that far. Even if it were uncommon, it is still possible and has to be handled, which makes for super ugly "special case" code.

Plus, you don't need to sort the users' IDs to deliver messages; it's a foreach kind of operation.

And finally, given the underlying hardware, sorting 8-bit integers wouldn't be faster than sorting 64-bit ones (which we don't need to do anyway), because processors move all the bits of a word in parallel. Unless WhatsApp runs on 8-bit microcontrollers.

[-] chonglibloodsport@lemmy.world 1 points 2 days ago* (last edited 2 days ago)

I didn’t say sorting, I said “storting” and must have corrected the typo while you were writing your reply. I meant storing. Having a 64-bit UUID attached to every single one of trillions of messages (per day) is a huge amount of wasted space (about 8 TB per trillion messages, just to store the 64-bit UUIDs without any message contents).
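
A quick sanity check of that storage arithmetic, as a minimal sketch. It assumes exactly 8 bytes per 64-bit ID, 1 byte per 8-bit index, and a round 10^12 messages; the function and variable names are made up for illustration.

```python
# Back-of-the-envelope check: storage needed for per-message IDs alone.
MESSAGES = 10**12  # one trillion messages (assumed round figure)


def id_storage_bytes(id_bits: int, messages: int = MESSAGES) -> int:
    """Bytes needed to store one ID of `id_bits` bits per message."""
    return (id_bits // 8) * messages


wide = id_storage_bytes(64)   # a 64-bit ID on every message
narrow = id_storage_bytes(8)  # an 8-bit table index on every message

# Report in decimal terabytes (10**12 bytes) and binary tebibytes (2**40 bytes).
print(f"64-bit IDs: {wide / 10**12:.1f} TB ({wide / 2**40:.2f} TiB)")
print(f" 8-bit IDs: {narrow / 10**12:.1f} TB ({narrow / 2**40:.2f} TiB)")
```

Under those assumptions the 64-bit IDs cost about 8 TB per trillion messages and the 8-bit indices about 1 TB, so the difference is roughly 7 TB per trillion messages before any compression.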

As an annoying aside, my phone now thinks “storting” is a word and helpfully autocorrects storing to that now. Good grief!

[-] JackbyDev@programming.dev 2 points 1 day ago

I nominate storting to mean storing and sorting at the same time. Like in a binary heap, binary tree, sorted array, etc. It's a common thing and similar to other words like "upsert".
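
For anyone who wants to see "storting" in action, here is a minimal sketch using Python's standard `bisect` and `heapq` modules. The sample IDs are made up.

```python
import bisect
import heapq

incoming = [42, 7, 19, 3]  # hypothetical user IDs arriving in arbitrary order

# "Storting" into a sorted array: every insert keeps the list in order.
sorted_ids: list = []
for user_id in incoming:
    bisect.insort(sorted_ids, user_id)

# "Storting" into a binary heap: the smallest item is always at index 0.
heap: list = []
for user_id in incoming:
    heapq.heappush(heap, user_id)

smallest = heapq.heappop(heap)  # removes and returns the minimum
```

In both cases the ordering is maintained as part of the store operation itself, which is the "upsert"-like combination the word is meant to capture.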

[-] Tja@programming.dev 1 points 2 days ago

I don't see how a message uuid is related to the group membership storage...

I haven't seen the code of WhatsApp, obviously, but I use a similar question to interview candidates. There are a few ways of implementing groups, and you have to store group membership somehow, but just once per group.

When a message is sent, it can be stored with a foreign key that relates it to the group, a message ID that should be unique for whatever DB is in use, plus a timestamp. When checking for new messages, a client provides the timestamp of the last retrieved message and the server returns all messages since then (per group). Even read confirmations can be implemented using timestamps. There's no need to store all group members for every message (not that you claimed there is, just making sure).
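
A rough sketch of that layout, using Python's built-in `sqlite3` with an in-memory database. This is only an illustration of the idea, not how WhatsApp or any real service does it: the table names, column names, integer timestamps, and the `body` column are all assumptions, and it mirrors only the fields named above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE chat_groups (
        group_id INTEGER PRIMARY KEY
    );
    -- Membership is stored once per group, not once per message.
    CREATE TABLE group_members (
        group_id INTEGER NOT NULL REFERENCES chat_groups(group_id),
        user_id  INTEGER NOT NULL,
        PRIMARY KEY (group_id, user_id)
    );
    -- Each message carries a unique ID, a foreign key to its group, and a timestamp.
    CREATE TABLE messages (
        message_id INTEGER PRIMARY KEY,
        group_id   INTEGER NOT NULL REFERENCES chat_groups(group_id),
        sent_at    INTEGER NOT NULL,
        body       TEXT NOT NULL
    );
    CREATE INDEX idx_messages_group_time ON messages (group_id, sent_at);
""")


def send_message(group_id: int, sent_at: int, body: str) -> int:
    """Store one message and return its database-assigned ID."""
    cur = conn.execute(
        "INSERT INTO messages (group_id, sent_at, body) VALUES (?, ?, ?)",
        (group_id, sent_at, body),
    )
    return cur.lastrowid


def messages_since(group_id: int, last_seen: int) -> list:
    """Return (sent_at, body) for messages in a group newer than last_seen."""
    rows = conn.execute(
        "SELECT sent_at, body FROM messages "
        "WHERE group_id = ? AND sent_at > ? ORDER BY sent_at, message_id",
        (group_id, last_seen),
    )
    return rows.fetchall()


conn.executemany("INSERT INTO chat_groups (group_id) VALUES (?)", [(1,), (2,)])
send_message(1, 100, "hello")
send_message(1, 200, "world")
send_message(2, 150, "other group")

# A client that last saw timestamp 100 in group 1 asks for anything newer.
new_for_client = messages_since(1, 100)
```

The client only has to remember one timestamp per group, and the server answers with a single indexed range query.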

[-] chonglibloodsport@lemmy.world 1 points 2 days ago

Sounds like you’re not storing who sent each message to the group, so how is anyone supposed to follow a group conversation between multiple participants if all you’re storing is the group ID and a time stamp?

[-] Tja@programming.dev 1 points 2 days ago

Oh, you mean the sender ID. I would definitely store the UUID, but I understand the tradeoff of storing something smaller. However, given the code complexity of reusing IDs and the small relative savings (even smaller with compression), I would definitely prefer the UUID solution.

[-] chonglibloodsport@lemmy.world 1 points 2 days ago

This is how I would do it (and I think how it’s done but can’t confirm):

There’s really no complexity at all because you can just store a table of group members with 256 entries and send the index into that table with each message to each user. The users have a copy of the table on their client and when they receive the message the client looks it up in the table and stores it in the local message history.

You would not store message history on the server. Only messages which have not been delivered to all group members would be stored on the server. When people leave/join the group, you send group membership notices to all members and their clients update their tables accordingly.

Since you don’t store message histories on the server, new people who join the group can’t see messages that were sent before they joined. This eliminates the need to send UUIDs with every message and furthermore it eliminates the need to send large message histories all at once when someone joins a group. Since clients store their own histories with UUIDs attached to messages (not table indices) there is no issue with table index reuse.
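
Here is a minimal sketch of that scheme, to show how little code it takes. Everything in it is hypothetical (the class, the one-byte wire format, the sample IDs) and it is not taken from any real client.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

TABLE_SIZE = 256  # an index into the table fits in a single byte


@dataclass
class GroupTable:
    """Per-group member table; server and clients each keep a copy."""
    slots: List[Optional[str]] = field(
        default_factory=lambda: [None] * TABLE_SIZE
    )

    def join(self, user_id: str) -> int:
        index = self.slots.index(None)  # raises ValueError if the group is full
        self.slots[index] = user_id
        return index

    def leave(self, user_id: str) -> None:
        self.slots[self.slots.index(user_id)] = None  # slot can be reused


def encode(sender_index: int, text: str) -> bytes:
    """Wire format (assumed): one index byte followed by UTF-8 text."""
    return bytes([sender_index]) + text.encode("utf-8")


def decode(table: GroupTable, wire: bytes) -> Tuple[str, str]:
    """Client side: resolve the index to a full ID before storing locally."""
    sender = table.slots[wire[0]]
    if sender is None:
        raise KeyError(f"no member at index {wire[0]}")
    return sender, wire[1:].decode("utf-8")


table = GroupTable()
alice = table.join("alice-uuid")
bob = table.join("bob-uuid")

wire = encode(bob, "hi")
history = [decode(table, wire)]  # local history keeps the full ID

table.leave("bob-uuid")
carol = table.join("carol-uuid")  # takes over the slot bob vacated
```

Because the local history stores the resolved ID rather than the index, Carol taking over Bob's old slot does not change who the earlier message is attributed to.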

[-] Tja@programming.dev 1 points 2 days ago* (last edited 2 days ago)

Disclaimer: I don't use WhatsApp, mostly Slack at work and Signal personally.

Then the tradeoff is that you can't rely on the server for replay. What if you have two clients, desktop and mobile for instance? A message is delivered to the desktop while the phone is offline, I shut down the computer, turn on the phone and I won't see the message on there. All to save 7 bytes on a message of potentially hundreds? Weird tradeoff. Even less than 7, given compression.

[-] chonglibloodsport@lemmy.world 1 points 1 day ago* (last edited 1 day ago)

I think it’s helpful to bring up a bit about WhatsApp’s history.

WhatsApp was developed in 2009 (for the iPhone) to provide status notifications (Away, Busy, At Work, etc) back when SMS was the only way to message people on phones and SMS did not have such statuses. It soon morphed into a drop-in replacement for SMS messaging which helped it take off in many countries around the world where SMS delivery fees were extremely expensive but small (<1 GB) data plans were cheap (or relatively cheap, on a per-byte basis).

For most of that early history and rapid growth there was no desktop app, only the phone app. You didn’t need to create an account either: your phone number was your account. This model meant that you didn’t want someone else receiving all your messages just because they inherited your phone number, so server-side history was a non-starter to begin with. I think at some point they added the ability to back up your chat history from your phone to a cloud account such as iCloud or Google Drive.

When they launched the desktop version of WhatsApp they tied it to your phone. You had to use your phone to sign in and if your desktop lost connection to the server it could not reconnect by itself.

Anyway, if you think about users in countries like India or Brazil where SMS messages were either unavailable or cost a fortune and data plans were expensive (but still much cheaper than SMS per-byte) then it makes total sense to save as many bytes as possible over the wire. Also consider that WhatsApp’s killer feature, its group chats, is a perfect match for larger families to keep in touch. However, I think even the largest families need fewer than 256 people in one group chat.

The situation for chat history may have changed more recently under the stewardship of Meta / Facebook. I think they have begun to target Slack by marketing WhatsApp Business.

this post was submitted on 12 Aug 2025
544 points (96.9% liked)