1U mini PC for AI?
I think the mainboard from the Framework Desktop meets your requirements: https://frame.work/au/en/products/framework-desktop-mainboard-amd-ryzen-ai-max-300-series?v=FRAFMK0002
Pretty sure that's an x4 PCIe slot (admittedly PCIe 5.0 x4, but not many video cards speak PCIe 5.0). I would totally trade a USB4 port for an x8 slot, but these laptop chips are pretty constrained lane-wise.
It's PCIe 4.0 :(
Indeed. From what I've read, Strix Halo only has 16 PCIe 4.0 lanes in addition to its USB4, which is reasonable given it isn't supposed to be paired with discrete graphics. But I'd happily trade an NVMe slot (still leaving one) for an x8 slot.
One of the links to a CCD could theoretically be wired to a GPU, right? Kinda like how EPYC can switch its I/O between Infinity Fabric for 2P servers and extra PCIe in 1P configurations. But I doubt we'll ever see such a product.
Boo! Silly me thinking DDR5 implied PCIe5, what a shame.
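For a rough sense of what the x4 vs x8 trade-off discussed here is worth, a back-of-the-envelope sketch follows. The per-lane rates (8, 16 and 32 GT/s for PCIe 3.0, 4.0 and 5.0) and the 128b/130b encoding are the published figures; the function name `pcie_gb_per_s` is made up for illustration, and real throughput will be lower once protocol overhead is counted.

```python
# Raw per-lane signalling rate in GT/s for each PCIe generation.
GT_PER_S = {3: 8.0, 4: 16.0, 5: 32.0}

# PCIe 3.0 and later use 128b/130b line encoding.
ENCODING_EFFICIENCY = 128 / 130


def pcie_gb_per_s(gen: int, lanes: int) -> float:
    """Theoretical one-direction bandwidth in GB/s (hypothetical helper).

    Ignores packet and protocol overhead, so treat it as an upper bound.
    """
    per_lane = GT_PER_S[gen] * ENCODING_EFFICIENCY / 8  # bits -> bytes
    return per_lane * lanes


if __name__ == "__main__":
    for gen, lanes in [(4, 4), (4, 8), (5, 4)]:
        print(f"PCIe {gen}.0 x{lanes}: {pcie_gb_per_s(gen, lanes):.2f} GB/s")
```

By this estimate a PCIe 5.0 x4 link would match a PCIe 4.0 x8 link, while PCIe 4.0 x4 gets about half of that.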
Feels like they're testing the waters with Halo. Hopefully a loud 'water's great, dive in' signal gets through and we get something a bit better suited to desktop use, maybe with more memory (and bandwidth) next gen. Still, gotta love the power usage; it makes for one hell of a NAS / AI inference server. And inference isn't that fussy about PCIe bandwidth: hell, an eGPU works fine as long as the model / expert fits in VRAM.
Rumor is its successor is 384-bit, and after that their designs get even more modular:
https://www.techpowerup.com/340372/amds-next-gen-udna-four-die-sizes-one-potential-96-cu-flagship
Hybrid inference prompt processing actually is pretty sensitive to PCIe bandwidth, unfortunately, but again I don’t think many people intend on hanging an AMD GPU off these Strix Halo boards, lol.
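To illustrate why link bandwidth can matter for hybrid setups, here is a rough sketch. It assumes the runtime copies weights held in system RAM across PCIe to the GPU for each prompt batch, which is an assumption about the setup and not something every runtime does. The 40 GB figure and the function name `stream_seconds` are hypothetical, and the link speeds are approximate theoretical one-direction rates.

```python
def stream_seconds(weights_gb: float, link_gb_per_s: float) -> float:
    """Time to push `weights_gb` of data across a link once (hypothetical helper)."""
    if link_gb_per_s <= 0:
        raise ValueError("link bandwidth must be positive")
    return weights_gb / link_gb_per_s


if __name__ == "__main__":
    # Hypothetical amount of weight data that lives in system RAM.
    weights_gb = 40.0

    # Approximate theoretical one-direction bandwidth in GB/s.
    links = {"PCIe 4.0 x4": 7.88, "PCIe 4.0 x8": 15.75}

    for name, bandwidth in links.items():
        seconds = stream_seconds(weights_gb, bandwidth)
        print(f"{name}: about {seconds:.1f} s per full pass over the weights")
```

Under these assumptions, halving the link width doubles the time spent just moving weights, before any compute happens.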