this post was submitted on 13 Apr 2024
The thing with benchmarks is that they only show you the performance of the type of workload the benchmark is trying to emulate. That’s not very useful in this case. Current PC software is not built with this kind of architecture in mind, so it was never designed to take advantage of it. In fact, it’s the exact opposite: since transferring data to/from VRAM is a huge bottleneck, software is designed to avoid it as much as possible.
For example: a GPU is extremely good at performing an identical operation on lots of data in parallel. The GPU can perform such an operation much, much faster than the CPU. However, copying the data to VRAM and back may add so much time that the whole job still finishes sooner on the CPU. A developer may then choose to run it on the CPU instead, even though the GPU was specifically designed to handle that kind of work. On a system with UMA there is no copy to pay for, so you would absolutely run this on the GPU.
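To put rough numbers on that trade-off, here is a minimal sketch of the decision a developer is making. Everything in it is an assumption for illustration: the `Workload` fields, the helper names, the 16 GB/s link speed and the timings are made up, not measurements from any real hardware.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Workload:
    bytes_in: int       # data that has to be copied to the GPU
    bytes_out: int      # results that have to be copied back
    cpu_seconds: float  # time to run the whole job on the CPU
    gpu_seconds: float  # time for the GPU to do the compute alone


def gpu_total_seconds(w: Workload, link_bytes_per_s: Optional[float]) -> float:
    """GPU compute time plus copy time.

    link_bytes_per_s=None models unified memory: CPU and GPU share the
    same RAM, so there is nothing to copy.
    """
    if link_bytes_per_s is None:
        return w.gpu_seconds
    return w.gpu_seconds + (w.bytes_in + w.bytes_out) / link_bytes_per_s


def best_device(w: Workload, link_bytes_per_s: Optional[float]) -> str:
    """Pick whichever device finishes the job first."""
    return "gpu" if gpu_total_seconds(w, link_bytes_per_s) < w.cpu_seconds else "cpu"


# Hypothetical job: 1 GB in, 1 GB out, GPU compute is 10x faster than the CPU.
job = Workload(bytes_in=1_000_000_000, bytes_out=1_000_000_000,
               cpu_seconds=0.10, gpu_seconds=0.01)

print(best_device(job, 16e9))  # discrete GPU behind an assumed 16 GB/s link
print(best_device(job, None))  # unified memory, no copy
```

With these made-up numbers the copy alone costs 0.125 s, so the discrete GPU loses to the CPU despite computing 10x faster, while the UMA case picks the GPU. Real systems can overlap copies with compute, so treat this as the worst case rather than a prediction.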
The same thing goes for something like AI accelerators. What PC software exists that takes advantage of such a thing?
A good example of what happens if you design software around this kind of architecture can be found here. This is a post by a developer who worked on Affinity Photo. When they designed this software they anticipated that hardware would move towards a unified memory architecture and designed their software based on that assumption.
When they finally got their hands on UMA hardware in the form of an M1 Max, that laptop chip beat the crap out of a $6000 W6900X.
We’re starting to see software taking advantage of these things on macOS, but the PC world still has some catching up to do. The hardware isn’t there yet, and the software always lags behind the hardware.
It’s coming, but Apple is ahead of the game by several years. The problem is that in the PC world no one has a good answer to this yet.
Nvidia makes big, hot, power-hungry discrete GPUs. They don’t have an x86 core, and Windows on ARM is a joke at this point. I expect them to focus on the server side with custom high-end AI processors and slowly move out of the desktop space.
AMD is in the best position on the desktop. They have a decent x86 core and a decent GPU, and they already make APUs. Intel is trying to get into the GPU game but has some catching up to do.
Apple has been quietly working towards this for years. They have their UMA in place, they are starting to put some serious effort into GPU performance, and rumor has it that with M4 they will make some big steps in AI acceleration as well. The PC world is held back by a lot of legacy hardware and software, but there will be a point where they will have to catch up or be left in the dust.