view the rest of the comments
Selfhosted
A place to share alternatives to popular online services that can be self-hosted without giving up privacy or locking you into a service you don't control.
Rules:
-
Be civil: we're here to support and learn from one another. Insults won't be tolerated. Flame wars are frowned upon.
-
No spam posting.
-
Posts have to be centered around self-hosting. There are other communities for discussing hardware or home computing. If it's not obvious why your post topic revolves around selfhosting, please include details to make it clear.
-
Don't duplicate the full text of your blog or github here. Just post the link for folks to click.
-
Submission headline should match the article title (don’t cherry-pick information from the title to fit your agenda).
-
No trolling.
Resources:
- selfh.st Newsletter and index of selfhosted software and apps
- awesome-selfhosted software
- awesome-sysadmin resources
- Self-Hosted Podcast from Jupiter Broadcasting
Any issues on the community? Report it using the report flag.
Questions? DM the mods!
I've been using a number of different tools which I interface to my nextcloud.
My main nextcloud has a llm plugin which was really easy to install, you just install the plug-in, make sure that you are configured properly with python in your path, and then run an OCC command to download one of a few models.
https://localai.io/
I also hosted localAI, which was a little bit more involved, but the website did a decent enough job of explaining exactly all the things that you needed to do in order to get all the different types of AI model working. Besides LLMs, it also supports text to speech, speech to text, and image generation.
Two things that are important: first, if you are server doesn't have a pretty advanced video card then you're going to be using the CPU exclusively for AI, and that'll be pretty slow. Second, I found it very quickly that the amount of RAM you have is critical. My main server is a core i5 4th gen, and so I put AI software on another one of my servers which is a core i5 7th gen. You would think that the latter would work a lot better, but it had half the ram, and it basically wasn't even able to get started.
Besides hosting ai, if you have a desktop computer or gaming laptop you can run local AI models. There's a fantastic piece of software called Faraday that works pretty well on my laptop. You can get more and more sophisticated models depending on how much memory you have.
https://youtu.be/aLy_vVLUHZk
Krita has AI dal-e support for image generation available as a plug-in. I haven't used it yet because I just got it started downloading last night before I went to bed, but the installation process has defined in the video seems accurate and was extremely easy and mostly automated.
https://youtu.be/AU8NDSBIS1U
Here is an alternative Piped link(s):
https://piped.video/aLy_vVLUHZk
https://piped.video/AU8NDSBIS1U
Piped is a privacy-respecting open-source alternative frontend to YouTube.
I'm open-source; check me out at GitHub.
Is there an amount of RAM that's currently considered the bare minimum for CPU-only self-hosting?
If you're using llama.cpp, have a look at the GGUF models by TheBloke on huggingface. He puts approximate RAM required in the readme based on the quantisation level.
From personal experience I'd estimate 12G for 7B models based on how full RAM was with 16 gigs. For mixtral at least 32G.
Thanks, appreciate it (I'm new to local text CPU models, I know it was a stupid question).