For years I’ve had a dream of building a rack mounted PC capable of splitting its resources to host multiple GPU intensive VMs:

  • a few gaming VMs
  • a VM for work that can run DaVinci Resolve and Blender renders
  • an LLM server
  • a Stable Diffusion server
  • media server

Just to name a few possibilities…

Every time I’ve looked into it, it seemed like the technology just wasn’t there yet. I remember a few years ago Linus TT took a shot at it, but in the end suggested the technology (for non-commercial entities) just wasn’t in a comfortable spot yet.

So how far off are we? Obviously AI focused companies seem to make it work, but what possibilities exist for us self-hosters who might also want to run multiple displays in addition to the web gui LLM servers? And without forking out crazy money for GPU virtualization software licenses?

[-] Takumidesh@lemmy.world 9 points 5 months ago

None of the presented solutions cover the aspect of being in a different place than the rack, the same network is fine, but at a minimum a different room.

How do you deliver high resolution (e.g. 1440p, 144 fps) to multiple monitors with low latency over a network? I haven't seen anything like that accomplished without running fiber from the host.

Eventually, your thin client will need too much power anyway, making the costs rise a lot. It makes sense in an office where you have 500 seats and you can load balance resources.

If someone can show me a multi seat gaming server that has native remote performance (as in you drag windows around in 144 fps, not the standard artifacty high latency behavior of vnc) I'll eat a shoe.
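
For a sense of scale, here is a rough sketch of what one uncompressed 1440p, 144 fps stream would need. The 2560x1440 resolution, 24 bits per pixel, and the 1 Gbit/s link are assumptions for illustration, and blanking intervals and protocol overhead are ignored.

```python
def raw_video_gbps(width: int, height: int, fps: int, bits_per_pixel: int = 24) -> float:
    """Uncompressed video bitrate in Gbit/s (no blanking, no protocol overhead)."""
    return width * height * fps * bits_per_pixel / 1e9


LINK_GBPS = 1.0  # assumed: an ordinary gigabit copper link

raw = raw_video_gbps(2560, 1440, 144)  # assumed 1440p panel at 144 fps
ratio = raw / LINK_GBPS                # compression needed to fit the link

print(f"raw stream: {raw:.2f} Gbit/s")
print(f"compression needed for a {LINK_GBPS:.0f} Gbit/s link: about {ratio:.0f}:1")
```

Under those assumptions a single display comes out at roughly 12.7 Gbit/s raw, so the choice is between dedicated video extenders and a compressed stream over the regular network.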

[-] mesamunefire@lemmy.world 7 points 5 months ago

Yep just ping time and latency make this a no go for a vast majority of us.

[-] jet@hackertalks.com 0 points 5 months ago* (last edited 5 months ago)

Can you define what acceptable latency would be?

local network ping (like corporate networks) 1-2ms

Encoding and decoding delay 10-15ms

So about ~20ms of latency

Real world example
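
The figures above can be put next to the frame time of the display. This is only a sketch of the arithmetic: the ping and codec ranges are the ones quoted in this comment, the 144 fps target comes from the parent comment, and the function name is made up for illustration.

```python
PING_MS = (1.0, 2.0)     # local network ping, from the comment above
CODEC_MS = (10.0, 15.0)  # encode + decode delay, from the comment above


def latency_budget(fps: float) -> dict:
    """Added latency in milliseconds and in frames at the given refresh rate."""
    frame_ms = 1000.0 / fps
    low = PING_MS[0] + CODEC_MS[0]
    high = PING_MS[1] + CODEC_MS[1]
    return {
        "frame_ms": frame_ms,
        "total_ms": (low, high),
        "total_frames": (low / frame_ms, high / frame_ms),
    }


budget = latency_budget(144)
print(budget)
```

The summed range is 11 to 17 ms, so "about 20 ms" is a rounded-up figure; at 144 fps that is roughly 1.6 to 2.4 frames behind local output.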

[-] yggstyle@lemmy.world 1 points 5 months ago* (last edited 5 months ago)

None of the presented solutions cover the aspect of being in a different place than the rack, the same network is fine, but at a minimum a different room.

If someone can show me a multi seat gaming server that has native remote performance (as in you drag windows around in 144 fps, not the standard artifacty high latency behavior of vnc) I'll eat a shoe.

Thin clients absolutely can do this already. There are a variety of ways to transmit low latency video around a home, from HDBaseT solutions to multicast / network driven ones. Never mind basic solutions like Sunshine/Moonlight... Nvidia variants etc.

I have a single racked PC for feeding my home which has 3 'desk' endpoints and two TVs... all of which are fed from the same location and can be dynamically matrixed (albeit the choke point is USB2 to each location because I'm cheap). Latency is maybe 1.5-3 frames from live. Other solutions are normally around 5-8 frames, which while higher are sufficiently snappy and won't affect competitive play (professional level notwithstanding).

A lot of latency comes down to tuning your solution and research. The vnc method you refer to is the lowest common denominator running on ancient technology and codecs simply because it is a widely supported standard.

Edit: As far as 144 goes- I don't have any displays that run that but I have two running at 120 with no issue.
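
To compare the frame counts above with latency figures given in milliseconds, the conversion is simple. A small sketch, assuming the 120 Hz displays mentioned in the edit (the refresh rate the distribution gear itself runs at is not stated, so treat that as an assumption):

```python
def frames_to_ms(frames: float, refresh_hz: float) -> float:
    """Convert a delay measured in frames to milliseconds at a given refresh rate."""
    return frames * 1000.0 / refresh_hz


REFRESH_HZ = 120.0  # assumed from the 120 Hz displays mentioned above

# Frame ranges quoted in the comment above.
ranges = {"this setup": (1.5, 3.0), "other solutions": (5.0, 8.0)}

for label, (low, high) in ranges.items():
    low_ms = frames_to_ms(low, REFRESH_HZ)
    high_ms = frames_to_ms(high, REFRESH_HZ)
    print(f"{label}: {low_ms:.1f} to {high_ms:.1f} ms")
```

At 120 Hz that works out to 12.5 to 25 ms for the first range and about 42 to 67 ms for the second.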

[-] Takumidesh@lemmy.world 2 points 5 months ago

What is the cost of the thin clients and are you doing this over copper?

Are your desks multi monitor? To get the bare minimum in my household's scenario I would need at least 12 streams at greater than 1080p.

For 5 seats how much did it cost versus just having a computer in each location? For example looking at hdbaset to replace just my desk setup, I would need 4 ~$350 devices, just looking at monoprice for an idea (https://www.monoprice.com/product?p_id=21669) which doesn't even cover all of the screens in my office.

[-] yggstyle@lemmy.world 3 points 5 months ago* (last edited 5 months ago)

The two workstation nooks (spaces) have the capability to have a second monitor but I've since retired them in favor of ultrawide monitors which I find are a better experience in general. My current working solution is a split between two technologies: one thin client (second monitors) and one network distribution solution using multicast (primary displays and USB). Both run on copper 1 gig but the multicast traffic requires a switch that doesn't suck and vlan usage. On average a single port can reach 70-85% usage sustained. I believe my longest run is 150' ish.

Cost per node is roughly 300- so comparable to what you are experiencing. If I went stupid cheap I could probably cut that to maybe 150-250 depending on my luck with eBay and patience.

In terms of capabilities you could argue that this could be done without distribution using a NUC solution... but you'd have to split resources out to each node where you need a full feature set.

My central server is a threadripper build with 2 gpus for direct passthrough to 'gaming' vms and a split gpu handling the rest of the needs of the other systems. Thanks to the matrix capabilities any given seat can be any system... or in some cases 2 seats can be a single rig (2 room gaming off the same display). There is a cost savings to be found in splitting resources from a more expensive build out to cheaper nodes... but ymmv depending on active seats and specific needs. I believe as a general rule it should be less costly and more efficient (power/heat) than individual solutions.

[-] Takumidesh@lemmy.world 2 points 5 months ago

Thanks for the breakdown! This is probably the most helpful breakdown I've seen of a build like this.

[-] yggstyle@lemmy.world 2 points 5 months ago

Absolutely 👍. I'll just add that there are a lot of alternate routes to get the result you want so research and experiment but ideally set a deadline which can help with decision paralysis. Later changes are a problem for future you 😁.

[-] jet@hackertalks.com 0 points 5 months ago

Fiber isn't some exotic never seen technology, it's everywhere nowadays.

Moonlight literally does what you want, today! Using HEVC encoding straight in the GPU.

Try it out on your own network now.

[-] Takumidesh@lemmy.world 3 points 5 months ago

A DisplayPort to fiber extender is $2,000. The fiber is not for the network.

Moonlight does not do what I want: Moonlight requires a GPU on the thin client to decode. You would need a high end GPU to decode multiple high resolution video streams. Also afaik, Moonlight doesn't support multiple displays.

[-] jet@hackertalks.com 1 points 5 months ago

Fair enough. If you know it doesn't work for your use case that's fine.

As demonstrated elsewhere in this discussion, GPU HEVC encoding only requires 10ms of extra latency, then it can transit over fiber optic networking at very low latency.

Many GPUs have HEVC decoders on board, including the ones in cell phones. Most newer Intel and AMD CPUs actually have an HEVC decoder pipeline as well.

I don't think anybody's saying a self-hosted GPU VM is for everybody, but it does make sense for a lot of use cases. And that's where I think our schism is coming from.

As for the $2,000 transducer to fiber: it's doing the same exact thing, just with more specialized equipment and maybe a little bit lower latency.

this post was submitted on 15 Jun 2024
98 points (95.4% liked)