submitted 1 month ago* (last edited 1 month ago) by nikodindon@lemmy.world to c/selfhosted@lemmy.world

I just pushed v22 of my project: a local AI companion for Radarr that goes beyond generic genre or TMDb lists.

This isn't "yet another recommender". It's your personal taste explorer that actually gets the vibe you want in natural language and builds recommendations starting from your existing library.

Key highlights from a real recent run:

  • Command: --mood "dystopian films like Idiocracy, Gattaca or In Time"
  • Output: Metropolis (1927), V for Vendetta, Children of Men, Brazil (1985), Minority Report, Dark City, Equilibrium, Upgrade, The Road... → oppressive/surveillance/inequality/societal critique atmosphere, not just "dark sci-fi".

How it works:

  • Starts by sampling random movies from your Radarr collection (or uses your mood/like/saga input).
  • Asks a local Ollama LLM (e.g. mistral-small:22b) for 25 thematic suggestions based on atmosphere/vibe.
  • Validates each via OMDb (IMDb rating, genres, plot, director, cast...).
  • Scores intelligently: IMDb rating + genre match + director/actor bonus + plot embedding similarity (cosine on Ollama embeddings).
  • Adds the top ones directly to Radarr (with confirmation: all / one-by-one / no).
  • Persistent blacklist to avoid repeats.
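
The scoring and blacklist steps above could be sketched roughly like this. This is only an illustration, not the script's actual code: the `Candidate` fields, the `WEIGHTS` values and the function names are all assumptions made up for the example, and the plot similarity is taken as an already-computed number.

```python
from dataclasses import dataclass

# Hypothetical weights; the post does not say what the real ones are.
WEIGHTS = {"rating": 0.4, "genre": 0.25, "people": 0.15, "plot": 0.2}


@dataclass
class Candidate:
    title: str
    imdb_rating: float      # 0-10 scale, e.g. as reported by OMDb
    genres: set[str]
    people: set[str]        # director and main cast
    plot_similarity: float  # cosine similarity of plot embeddings, 0-1


def score(c: Candidate, liked_genres: set[str], liked_people: set[str]) -> float:
    """Combine the four signals from the list above into one weighted number."""
    rating = c.imdb_rating / 10.0
    # Share of the candidate's genres that the library already favours.
    genre = len(c.genres & liked_genres) / len(c.genres) if c.genres else 0.0
    # Flat bonus when a known director or actor is involved.
    people = 1.0 if c.people & liked_people else 0.0
    return (
        WEIGHTS["rating"] * rating
        + WEIGHTS["genre"] * genre
        + WEIGHTS["people"] * people
        + WEIGHTS["plot"] * c.plot_similarity
    )


def top_candidates(cands, liked_genres, liked_people, blacklist, n=10):
    """Drop blacklisted titles, then keep the n best-scoring candidates."""
    fresh = [c for c in cands if c.title not in blacklist]
    fresh.sort(key=lambda c: score(c, liked_genres, liked_people), reverse=True)
    return fresh[:n]
```

With a shape like this, the "min IMDb" filter mentioned further down would just be one more condition in the list comprehension.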

Different modes:

  • --mood "dark psychological thrillers with unreliable narrators" (any vibe you describe)
  • --like "Parasite" --mood "mind-bending class warfare" (or just --like "Whiplash")
  • --saga (auto-detects incomplete sagas in your library and suggests missing entries) or --saga "Star Wars"
  • --director "Kubrick" / --actor "De Niro" / --cast "Pacino De Niro" (movies where they co-star)
  • --analyze → full library audit + gaps (e.g. "You're missing Kurosawa classics and French New Wave")
  • --watchlist → import from Letterboxd/IMDb
  • --auto → perfect for daily cron / Task Scheduler (wake up to 10 fresh additions)
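
For anyone curious how flags like these can be wired up in a single-file script, here is a minimal `argparse` sketch. The flag names come from the list above; everything else (the `build_parser` name, which flags take a value, the `"auto"` placeholder) is an assumption for illustration and may not match the real script.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    p = argparse.ArgumentParser(description="Sketch of the modes listed above")
    p.add_argument("--mood", help="free-text description of the vibe")
    p.add_argument("--like", help="title of a movie to start from")
    # Bare --saga means auto-detect; --saga "Star Wars" targets one saga.
    p.add_argument("--saga", nargs="?", const="auto", default=None)
    p.add_argument("--director")
    p.add_argument("--actor")
    p.add_argument("--cast", help="names of actors who should co-star")
    # Assumed to be plain switches; the real tool may accept values here.
    p.add_argument("--analyze", action="store_true")
    p.add_argument("--watchlist", action="store_true")
    p.add_argument("--auto", action="store_true")
    return p


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

The `nargs="?"` plus `const` combination is what lets one flag work both with and without a value.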

Standout features:

  • 100% local + privacy-first (Ollama + free OMDb API only)
  • No cloud AI, no tracking
  • Colored console output, logs, stats, HTML/CSV exports
  • Synopsis preview before adding
  • Configurable quality profile, min IMDb, availability filters
  • Works on Windows, Linux, Mac

GitHub (clean single-file Python script + detailed README):
https://github.com/nikodindon/radarr-movie-recommender

If you're tired of generic Discover lists, Netflix-style randomness, or manual hunting, give it a spin. The vibe/mood mode + auto saga completion really change how you expand your collection.

Let me know what you think, any weird mood examples you'd like to test, or features you'd want added!

[-] circuscritic@lemmy.ca 20 points 1 month ago* (last edited 1 month ago)

Since no one is leaving critical comments that might explain all the downvotes, I'm going to assume they're reflexively anti-AI, which frankly, is a position that I'm sympathetic to.

But one of the benign, useful things I already use AI for is giving it criteria for shows and asking it to generate lists.

So I think your project is pretty neat and well within the scope of actually useful things that AI models, especially local ones, can provide the users.

[-] webkitten@piefed.social 5 points 1 month ago

Seriously; local AI use is what everyone should strive for, not only for privacy but because it's better than using a large data centre, and the power use for Ollama is negligible.

[-] u_tamtam@programming.dev 1 points 1 month ago

LLMs are not the tool for a recommender job

[-] FerCR@kbin.earth 2 points 1 month ago

The local LLM here is, if I'm not mistaken @nikodindon@lemmy.world, just used as a feature extraction tool. It's not like asking ChatGPT what to watch next but rather asking it to summarise the movie as an Excel file, which you then process to compute which movie(s) is (are) similar.

[-] eager_eagle@lemmy.world 10 points 1 month ago

that's pretty cool, this is just the wrong crowd, don't worry about the downvotes

[-] nikodindon@lemmy.world 1 points 1 month ago

thanks ! ^^

[-] meldrik@lemmy.wtf 8 points 1 month ago

This is a cool tool. Thanks for sharing. Don’t worry about the downvotes. The Fediverse has a few anti-AI zealots who love to brigade.

[-] nikodindon@lemmy.world 1 points 1 month ago

Thank you ! :)

[-] fubarx@lemmy.world 5 points 1 month ago

The more local inference, the better. Nice work!

[-] timestatic@feddit.org 4 points 1 month ago

Sorry OP that you're getting downvote bombed. This is actually really neat. People go nuts when they hear AI but this is fully local so I think this reaction is unjust. This has nothing to do with ram prices since that stems from data centers or corpos pushing AI on you. Thank you for sharing

[-] Scrath@lemmy.dbzer0.com 4 points 1 month ago

I remember building something vaguely related in a university course on AI, before ChatGPT was released and the whole LLM thing took off.

The user had the option to enter a couple movies (so long as they were present in the weird semantic database thing our professor told us to use) and we calculated a similarity matrix between them and all other movies in the database based on their tags and by putting the description through a natural language processing pipeline.

The result was the user getting a couple surprisingly accurate recommendations.

Considering we had to calculate this similarity score for every movie in the database it was obviously not very efficient but I wonder how it would scale up against current LLM models, both in terms of accuracy and energy efficiency.

One issue, if you want to call it that, is that our approach was deterministic. Enter the same movies, get the same results. I don't think an LLM is as predictable for that

[-] LiveLM@lemmy.zip 3 points 1 month ago

One issue, if you want to call it that, is that our approach was deterministic. Enter the same movies, get the same results. I don't think an LLM is as predictable for that

Maybe lowering the temperature will help with this?
Besides, a tinge of randomness could even be considered a fun feature.

[-] four@lemmy.zip 1 points 1 month ago

I'm not an expert, but LLMs should still be deterministic. If you run the model with 0 creativity (or whatever the randomness setting is called) and provide exactly the same input, it should provide the same output. That's not how it's usually configured, but it should be possible. Now, if you change the input at all (change order of movies, misspell a title, etc) then the output can change in an unpredictable way

[-] hendrik@palaver.p3x.de 1 points 1 month ago* (last edited 1 month ago)

Yes. I think determinism is a misunderstood concept. In computing, it means the exact same input always leads to the same output. Could be a correct result or entirely wrong, though. As long as it stays the same, it's deterministic. There's some benefit in introducing randomness to AI. But it can be run in an entirely deterministic way as well. Just depends on the settings. (It's called "temperature".)
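
A toy illustration of the temperature point, in case it helps. This is not how any particular model is implemented: the `logits` scores and the `sample_token` function are made up, and a real LLM does this over a vocabulary of many thousands of tokens. The idea is the same, though: at temperature 0 you always take the top-scoring token, and above 0 you sample, which is only repeatable if the random seed is fixed.

```python
import math
import random


def sample_token(logits: dict[str, float], temperature: float, rng: random.Random) -> str:
    """Pick the next token from raw scores at a given temperature."""
    if temperature == 0:
        # Greedy decoding: always the highest-scoring token, no randomness.
        return max(logits, key=logits.get)
    # Softmax with temperature: higher values flatten the distribution.
    scaled = {tok: value / temperature for tok, value in logits.items()}
    top = max(scaled.values())
    weights = [math.exp(v - top) for v in scaled.values()]
    return rng.choices(list(scaled), weights=weights, k=1)[0]


# Made-up scores for three possible continuations.
logits = {"Brazil": 2.0, "Dark City": 1.5, "Equilibrium": 0.5}

# Temperature 0: same answer every time, whatever the random state is.
greedy = {sample_token(logits, 0, random.Random()) for _ in range(100)}
print(greedy)  # {'Brazil'}

# Temperature 1: the picks vary, but the same seed replays the same sequence.
rng_a, rng_b = random.Random(7), random.Random(7)
run_a = [sample_token(logits, 1.0, rng_a) for _ in range(10)]
run_b = [sample_token(logits, 1.0, rng_b) for _ in range(10)]
print(run_a == run_b)  # True
```

As far as I know, Ollama exposes the equivalent knobs as the `temperature` and `seed` options on a request, so a tool like this could offer a reproducible mode if people want one.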

[-] pfr@piefed.social 4 points 1 month ago

Anti-AI evangelism is at its peak rn.

[-] Andres4NY@social.ridetrans.it 1 points 1 month ago* (last edited 1 month ago)

@pfr @nikodindon That assumes it won't get worse, which I hope it does. AI companies have forced me to take down web stuff that I had running for almost 2 decades, because their scrapers are so aggressive.

[-] halloween_spookster@lemmy.world 1 points 1 month ago

20 decades

Found the time traveler!

[-] meldrik@lemmy.wtf 1 points 1 month ago

Like what, and what have you tried to block it?

[-] Andres4NY@social.ridetrans.it 0 points 1 month ago

@meldrik They're impossible to block based on IP ranges alone. It's why all the FOSS git forges and bug trackers have started using stuff like Anubis. But yes, I initially tried to block them (this was before Anubis existed).

It was a few things that I had to take down; a gitweb instance with some of my own repos, for example. And a personal photo gallery. The scrapers would do pathological things like running repeated search queries for random email addresses or strings.

[-] meldrik@lemmy.wtf 1 points 1 month ago

I’m hosting several things, including Lemmy and PeerTube. I haven’t really been aware of any scrapers, but do you know of any software that can help block it?

[-] Overspark@piefed.social 2 points 1 month ago

A recommendation for Moonrise Kingdom based on Mickey 17? The genres might match, but those are totally different movies.

[-] Janx@piefed.social 2 points 1 month ago

Also, A Bug's Life from Mickey 17!?

[-] borari@lemmy.dbzer0.com 2 points 1 month ago

That made me lol so hard. Like what’s the fucking point of this thing when it comes up with shit like that?

[-] Alvaro@lemmy.blahaj.zone 1 points 1 month ago

A Bug's Life from Mickey 17?

Explain OP

[-] RIotingPacifist@lemmy.world 0 points 1 month ago

How does this compare to an ML approach?

Are you training or just using an LLM for this?

[-] eager_eagle@lemmy.world 4 points 1 month ago* (last edited 1 month ago)

There's no training; the LLM embeddings are used to compare the plots via cosine similarity, then a simple weighted score with other data sources is used to rank the candidates. There's no training, evaluation, or ground truth. It's just a simple tool to start using.

[-] FerCR@kbin.earth 2 points 1 month ago

Exactly! This has been done plenty of times in the past (there's a reason why some movie datasets are used as toy examples for data analysis). For those unfamiliar with the field, the LLM part here is simply that, instead of building a feature space from predefined tags or variables, it makes a "fuzzier" feature space where it embeds the movies based on the text tokens the model sees. In essence, the way to compute which movie to recommend is the same (i.e. no LLM); it is just that the data used for the computation is generated differently.
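
To make the "same computation, different features" point concrete, here is a minimal sketch of ranking by cosine similarity. The three-dimensional vectors are invented toy values; in the tool they would presumably be plot embeddings returned by the model, with hundreds of dimensions or more. The function names are hypothetical too.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: dot product divided by the product of the norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as "no similarity"
    return dot / (norm_a * norm_b)


def rank_by_plot(seed: list[float], candidates: dict[str, list[float]]) -> list[str]:
    """Order candidate titles from most to least similar to the seed vector."""
    return sorted(candidates, key=lambda t: cosine(seed, candidates[t]), reverse=True)


# Toy "embeddings", made up for the example.
seed = [1.0, 0.0, 1.0]
candidates = {
    "close match": [1.0, 0.0, 0.9],
    "unrelated": [0.0, 1.0, 0.0],
    "in between": [0.5, 0.5, 0.5],
}
print(rank_by_plot(seed, candidates))
# ['close match', 'in between', 'unrelated']
```

Swap the embedding vectors for tag-based feature vectors and nothing in the ranking code changes, which is the whole point.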

this post was submitted on 20 Mar 2026
19 points (70.2% liked)
