[-] yogthos@lemmy.ml 16 points 1 day ago* (last edited 1 day ago)

I expect that software will continue to get optimized, and we'll see new algorithms that are more efficient than what people are doing currently. It's also possible we'll start seeing hardware built specifically for models. For example, there's already a startup that prints the model directly onto an ASIC. Since each transistor acts as a state, it doesn't need DRAM, and the whole chip requires only a small amount of SRAM, which isn't in short supply right now: https://www.anuragk.com/blog/posts/Taalas.html

The limitation with this approach is that the chip is made for a specific model, but that's not really that different from the way regular chips work. You buy a chip, and if it does what you need, it keeps working. When new models come out, new chips get printed, and if you need the new capabilities, you upgrade.

You can see how absurdly fast their hardware version of Llama 3 is here: https://chatjimmy.ai/

[-] GnuLinuxDude@lemmy.ml 3 points 1 day ago

That is indeed absurdly fast.

this post was submitted on 23 Apr 2026
69 points (97.3% liked)
