I watched Nvidia's Computex 2024 keynote and it made my blood run cold
(www.techradar.com)
You still need a massive fleet of these to train those multi-billion-parameter models.
On the inference side, if you run a cloud service like ChatGPT, hosted Anthropic, or AWS Bedrock, these could answer questions quickly, but they cost a lot to operate at scale. I have a feeling the bean-counters are going to slow down the crazy overspending.
We're heading into a world where edge computing is more cost- and energy-efficient to operate. It's also more privacy-friendly. I'm more enthused about running these models on our phones and in-home devices. There, the race will be for TOPS vs. power savings.