Can you self-host AI at parity with ChatGPT?
(lemmy.ml)
It's all dependent on VRAM. If your GPU can load the distilled models without maxing out its VRAM, it'll run just as fast as any server farm.
It looks like your video card only has 8 GB of VRAM. That will be your bottleneck.
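For a rough sense of what fits in 8 GB, here's a back-of-the-envelope sketch (the overhead figure is a loose assumption covering KV cache and activations; real usage varies by runtime and context length):

```python
# Rough VRAM check: weights take (parameter count) x (bytes per weight),
# plus some overhead for KV cache and activations. Ballpark only.
def vram_needed_gb(params_billion: float, bits_per_weight: int,
                   overhead_gb: float = 1.5) -> float:
    weights_gb = params_billion * bits_per_weight / 8  # 1B params at 8-bit is ~1 GB
    return weights_gb + overhead_gb

for params, bits in [(7, 4), (7, 8), (14, 4), (70, 4)]:
    print(f"{params}B @ {bits}-bit: ~{vram_needed_gb(params, bits):.1f} GB")
```

By that estimate a 4-bit 7B distill fits comfortably in 8 GB, while a 4-bit 14B or an 8-bit 7B is already borderline.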
Also no ROCm support AFAIK, so it's running completely on the CPU.
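If you want to confirm whether inference is actually hitting the GPU, a quick check (assuming a PyTorch-based runtime; ROCm builds of PyTorch report through the same `cuda` API):

```python
import torch

# ROCm builds of PyTorch expose the GPU through the "cuda" API,
# so this check works for both NVIDIA and AMD builds.
if torch.cuda.is_available():
    print("GPU backend:", torch.cuda.get_device_name(0))
else:
    print("No GPU backend found; inference will fall back to the CPU")
```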
Yeah, that'll do it too. I've got a 6800 XT, which isn't technically supported, but it works well.
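For cards that aren't on the official ROCm support list, a workaround people commonly report is overriding the GPU architecture the runtime sees via an environment variable. A sketch (the `10.3.0` value is the one usually cited for RDNA2 cards; the right value depends on your card's chip, and this is unofficial):

```python
import os

# HSA_OVERRIDE_GFX_VERSION tells the ROCm runtime to treat the card as a
# given architecture (10.3.0 = gfx1030, the commonly reported RDNA2 value).
# It must be set before the GPU runtime initializes, so set it before
# importing torch, or export it in your shell before launching.
os.environ["HSA_OVERRIDE_GFX_VERSION"] = "10.3.0"

import torch  # imported after setting the env var on purpose

print("GPU available:", torch.cuda.is_available())
```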