this post was submitted on 10 Jan 2025
LocalLLaMA
Community to discuss about LLaMA, the large language model created by Meta AI.
This is intended to be a replacement for r/LocalLLaMA on Reddit.
founded 2 years ago
For LLMs you should look at AirLLM. I don't think there are convenient integrations with local chat tools yet, but an issue has already been opened on Ollama.
This is useless. llama.cpp already does what AirLLM does (offloading layers to the CPU), and it's actually faster. So just use Ollama.
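For context, llama.cpp exposes this split directly: the `-ngl` / `--n-gpu-layers` flag controls how many layers go to the GPU, with the rest staying on CPU. A minimal sketch (the model path and layer count are placeholders, not anything from this thread):

```shell
# Offload 20 transformer layers to the GPU; remaining layers run on CPU.
# Model path and layer count are assumptions -- tune for your hardware/VRAM.
./llama-cli -m ./models/model.Q4_K_M.gguf -ngl 20 -p "Hello"
```

Setting `-ngl 0` keeps everything on CPU; a value at or above the model's layer count offloads everything.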
That looks like exactly the sort of thing I want. Is there an existing solution to make it behave like an Ollama instance? (I have a bunch of services pointed at an Ollama instance running in Docker.)
You could try Harbor. Its description claims it provides an OpenAI-compatible API.
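An OpenAI-compatible API means any OpenAI-style client can target the local server just by swapping the base URL. A minimal stdlib-only sketch of what such a request looks like (the base URL and model name are assumptions; Ollama conventionally serves its OpenAI-compatible endpoint at `http://localhost:11434/v1`, and the request is built but not sent):

```python
import json
import urllib.request

# Assumed local endpoint (Ollama's conventional OpenAI-compatible base URL).
BASE_URL = "http://localhost:11434/v1"

# Standard OpenAI-style chat completion payload; model name is a placeholder.
payload = {
    "model": "llama3",
    "messages": [{"role": "user", "content": "Hello"}],
}

req = urllib.request.Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# urllib.request.urlopen(req) would send it; omitted here since no server is assumed running.
```

Services that already speak this protocol only need the base URL pointed at the new server; the request shape stays the same.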