I recently decided to take on the challenge of self-hosting and curating my music collection. I originally started looking at Lidarr, as I am already a big fan of Radarr and Sonarr, but it wasn't really what I was looking for. I'm not often seeking out full albums; I mostly find my music by listening to single tracks from Spotify's Discover Weekly playlist. I needed a solution that would let me replicate this experience while hosting my own MP3s and that would, ideally, be entirely automated.
I currently have the following setup running on a VPS:
- Azuracast - This provides me a streaming radio station that cycles through my entire library 24/7
- Navidrome - This fills the gap of the Spotify-like interface where I can play specific tracks, albums, or playlists
I bootstrapped my library with a Python script that parsed a list of Spotify URLs and downloaded all of the tracks with the spotdl library. This allowed me to grab my liked tracks, the playlists I had created, and a large number of albums I wanted.
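For anyone wanting to try the same bootstrap step, a minimal sketch might look like the following. It is not the script from the gist. It assumes the spotdl v4 command line (`spotdl download URL`) is installed and on the PATH, and that the URLs live in a plain text file, one per line; the function names are made up for illustration.

```python
import subprocess
from pathlib import Path


def read_spotify_urls(path):
    """Return the URLs listed in a text file, one per line.

    Blank lines and lines starting with '#' are skipped.
    """
    urls = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            urls.append(line)
    return urls


def build_spotdl_command(url):
    """Build the spotdl CLI call for one track, album, or playlist URL."""
    # Assumption: spotdl v4 syntax. Check `spotdl --help` on your version.
    return ["spotdl", "download", url]


def download_all(url_file, output_dir):
    """Download every URL in url_file into output_dir; return failed URLs."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    failed = []
    for url in read_spotify_urls(url_file):
        # spotdl saves into the current working directory by default,
        # so run it from inside the target folder.
        result = subprocess.run(build_spotdl_command(url), cwd=output_dir)
        if result.returncode != 0:
            failed.append(url)
    return failed
```

Calling `download_all("urls.txt", "/music/library")` (both paths are placeholders) would then work through the list and hand back any URLs that spotdl could not fetch, so they can be retried later.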
I then used ChatGPT to write two Python scripts:
- The first script runs via cron every Monday and uses SpotDL to grab the contents of my Discover Weekly playlist from Spotify. It puts all of the files into a folder named with that week's date and also creates a playlist file. This way I can easily browse that week's playlist in Navidrome and decide what to keep. It also sends me an email on completion or error.
- The second script is a bit more complex. It produces the same end result, but for all of my LastFM recommendations. It does this by spinning up a headless Chrome browser with Selenium in a Docker container. It then logs into my LastFM account, parses each recommendation, and uses pytube to download the linked videos, since LastFM just links directly to YouTube videos. This list should change as I continue scrobbling via Navidrome and other sources, but I still need to determine how often the cron job should run.
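The dated-folder and playlist-file step of the first script can be sketched roughly as below. This is an illustration rather than the gist's code: the folder naming scheme, the extension list, and the function names are assumptions, and the playlist is written as a plain M3U file with paths relative to its own folder.

```python
from datetime import date
from pathlib import Path

# Assumed set of audio extensions worth listing in the playlist.
AUDIO_EXTENSIONS = {".mp3", ".m4a", ".opus", ".ogg", ".flac"}


def weekly_folder(library_root, day=None):
    """Return the folder for this run, named after the run date (YYYY-MM-DD)."""
    day = day or date.today()
    return Path(library_root) / f"discover-weekly-{day.isoformat()}"


def write_playlist(folder):
    """Write an M3U playlist listing every audio file directly inside folder."""
    folder = Path(folder)
    tracks = sorted(
        p.name
        for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in AUDIO_EXTENSIONS
    )
    playlist = folder / f"{folder.name}.m3u"
    # Entries are bare file names, i.e. relative to the playlist's own folder.
    playlist.write_text("\n".join(["#EXTM3U", *tracks]) + "\n", encoding="utf-8")
    return playlist
```

A Monday run would create `weekly_folder(...)`, download into it, and then call `write_playlist` on the same folder. A crontab entry such as `0 6 * * 1 python3 /path/to/script.py` (time and path are placeholders) would schedule it for every Monday.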
My next step is figuring out how to connect to Azuracast/Navidrome using one of the many Subsonic-compatible clients so I can have mobile and offline playback. I'm currently looking at substreamer for Android.
I'd also like to look into a more seamless way of picking out the tracks I want to keep and discard from the playlists in Navidrome. I'm considering writing something to check its SQL database for liked tracks in each playlist and automatically move those into the main folder/playlist that Azuracast is playing from.
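One way that idea could look is sketched below. Navidrome stores its data in a SQLite file, but the table and column names used here (`playlist`, `playlist_tracks`, `media_file`, `annotation`, `starred`) are assumptions based on a simplified view of its schema, so check them against your own database before relying on this. The function names are hypothetical.

```python
import shutil
import sqlite3
from pathlib import Path

# Assumed, simplified schema: verify table and column names against
# your own Navidrome database before using this query.
STARRED_IN_PLAYLIST = """
SELECT mf.path
FROM playlist AS p
JOIN playlist_tracks AS pt ON pt.playlist_id = p.id
JOIN media_file AS mf ON mf.id = pt.media_file_id
JOIN annotation AS a ON a.item_id = mf.id AND a.item_type = 'media_file'
WHERE p.name = ? AND a.starred = 1
ORDER BY mf.path
"""


def starred_paths(conn, playlist_name):
    """Return file paths of starred (liked) tracks in the named playlist."""
    rows = conn.execute(STARRED_IN_PLAYLIST, (playlist_name,)).fetchall()
    return [row[0] for row in rows]


def move_to_main(paths, main_folder):
    """Move each existing file into main_folder; return the new paths."""
    main_folder = Path(main_folder)
    main_folder.mkdir(parents=True, exist_ok=True)
    moved = []
    for path in paths:
        source = Path(path)
        if source.is_file():
            target = main_folder / source.name
            shutil.move(str(source), str(target))
            moved.append(target)
    return moved
```

Opening the database read-only, for example with `sqlite3.connect("file:/path/to/navidrome.db?mode=ro", uri=True)` (the path is a placeholder), keeps the script from interfering with the running server. The query above also ignores which user starred the track, which only matters on a multi-user install.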
This whole setup took me only a couple of days to create, and largely relied on ChatGPT to write the scripts and Dockerfiles. I'm a capable programmer, but GPT-4 is absolutely OP if you know what you're trying to accomplish and how to debug its mistakes. That Selenium script only took me an hour from idea to completion, and I never modified the code by hand; I only prompted it for corrections and additions.
If anyone is interested, I've uploaded all the scripts to a gist. You just need to go through and update them with your credentials/URLs.
Please update this if you continue working on it, I’ve been looking for something like this.
Is the Chrome browser needed? Could this be swapped out for any Chromium browser? I try not to use any Google services (within reason, I still need a Gmail account for work).
I don't see why it wouldn't be possible to swap out the Docker container running Chrome for another one running Chromium or Firefox. The only interaction with the browser itself is via Selenium, which should be browser-agnostic. I just went with what ChatGPT was able to suggest immediately.
I should clarify that this is running a headless browser, so you don't actually need Chrome installed; it exists entirely within the confines of the container and is completely ephemeral. You could also modify this to work with the standard Selenium webdriver and your installed browser of choice, but I made this with the intention of running it on my server rather than my personal machine.
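As a rough illustration of that swap, the sketch below picks the browser by name and connects to a Selenium container over the network. It assumes Selenium 4's Python bindings and a standalone Selenium container listening at whatever URL you pass in; the helper names and the example URL are assumptions, not taken from the gist.

```python
# Headless flags per browser; "chrome" also covers Chromium builds.
HEADLESS_ARGS = {
    "chrome": ["--headless=new"],
    "firefox": ["-headless"],
}


def headless_args(browser):
    """Return the command-line flags that make the given browser headless."""
    try:
        return HEADLESS_ARGS[browser]
    except KeyError:
        raise ValueError(f"unsupported browser: {browser}") from None


def make_remote_driver(browser, grid_url):
    """Connect to a remote Selenium server and start a headless session."""
    # Imported here so the helper above works without selenium installed.
    from selenium import webdriver

    args = headless_args(browser)  # validates the browser name first
    if browser == "chrome":
        options = webdriver.ChromeOptions()
    else:
        options = webdriver.FirefoxOptions()
    for arg in args:
        options.add_argument(arg)
    return webdriver.Remote(command_executor=grid_url, options=options)
```

With a Firefox-based Selenium container running, something like `make_remote_driver("firefox", "http://localhost:4444/wd/hub")` (URL assumed) should be the only line that needs to change; the rest of the scraping code talks to the driver the same way.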
I would also be running this on a server if I do it, and I know having it containerised would be fine privacy-wise; it was more just curiosity about why you went with Chrome. Makes sense that ChatGPT went with Chrome though, since it’s the most used browser in the world at the moment.
How is the music quality that’s downloaded determined? (It could be somewhere in your script, haven’t looked at those yet).
Both spotdl and pytube download from YouTube as their source. My understanding is that they're able to grab 320 kbps audio if it's available. It's no FLAC ripped from CD, but it's good enough for my use case, since I don't want to drag torrenting or Usenet into my VPS.