TLDR:
I want to produce a Markdown, HTML, or PDF file for every article fetched from an RSS feed, for archival purposes. If we fetch new articles via RSS anyway, why not have the option to produce local copies of those articles while we're at it?
Currently, most RSS readers fetch articles and store them in a database file, such as SQLite. This database file is specific to the software in use (Liferea, etc.) and therefore not portable. Also, image files are either only saved temporarily as a cache, and are therefore not visible when viewing articles offline, or they are saved in an unorganized way (and often renamed).
This is fine for people who just want to read the news or new blogposts. But I want to save those new blogposts in a portable format and be able to read them anytime, from any offline device. Basically, I want something like "SingleFile" (github.com/gildas-lormeau/SingleFile) for RSS feeds: automatically (or manually, but easily) making a new file for every new article when fetching an RSS feed.
So, is there any open-source RSS software that saves articles as separate, portable files (such as Markdown, HTML, PDF, or EPUB), or at least allows bulk exporting articles in such a format?
So close yet so far:
"Newsflash" (from Flathub) allows exporting a HTML file of an article (without the images). This is almost what I want, but Newsflash doesn't allow selecting multiple articles and queing them to make HTML copies (tedious to do it one by one), nor does it save the images like SingleFile does.
Rationale:
The desired mirroring/scraping function (producing the Markdown/HTML/PDF file from an article) can piggyback on fetching the RSS feed, instead of having to scrape a site/blog separately. Since we already fetch an article when we use RSS, we should be able to locally parse it and produce a Markdown file (if text only), an HTML file, or a PDF. This saves both bandwidth and hardware resources.
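To illustrate how little extra work this is once the feed is fetched, here is a minimal stdlib-only sketch. It parses an RSS 2.0 document and writes one standalone HTML file per `<item>`; the inlined feed text, the `slugify` helper, and the `archive` output directory are my own assumptions for the example, not part of any existing reader. A real tool would fetch the feed over HTTP and also download the referenced images.

```python
# Minimal sketch: parse an RSS 2.0 feed and write one HTML file per item.
# The feed is inlined here for illustration; a real reader fetches it over HTTP.
import re
import xml.etree.ElementTree as ET
from html import escape
from pathlib import Path

RSS = """<?xml version="1.0"?>
<rss version="2.0"><channel>
  <title>Example Blog</title>
  <item>
    <title>First post</title>
    <link>https://example.com/first</link>
    <description>&lt;p&gt;Hello &lt;b&gt;world&lt;/b&gt;&lt;/p&gt;</description>
  </item>
</channel></rss>"""

def slugify(title: str) -> str:
    """Make a filesystem-safe file name from the article title."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")

def archive_feed(xml_text: str, out_dir: Path) -> list[Path]:
    """Write a standalone HTML file for each <item> and return the paths."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    root = ET.fromstring(xml_text)
    for item in root.iter("item"):
        title = item.findtext("title", default="untitled")
        link = item.findtext("link", default="")
        body = item.findtext("description", default="")  # already HTML
        page = (
            "<!DOCTYPE html><html><head><meta charset='utf-8'>"
            f"<title>{escape(title)}</title></head><body>"
            f"<h1>{escape(title)}</h1>"
            f"<p><a href='{escape(link)}'>original</a></p>"
            f"{body}</body></html>"
        )
        path = out_dir / f"{slugify(title)}.html"
        path.write_text(page, encoding="utf-8")
        written.append(path)
    return written

paths = archive_feed(RSS, Path("archive"))
print([p.name for p in paths])  # → ['first-post.html']
```

The same loop could hand `body` to an HTML-to-Markdown converter or a PDF renderer instead; the point is that the article content is already in hand at fetch time.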
It also makes it easy to produce files only for new articles, because RSS inherently surfaces new items in your feed list, so there is no need to manually specify what to download or skip, as there would be when scraping a site by hand.
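The "only new articles" part just needs a record of which item GUIDs have already been archived. A hedged sketch, assuming a simple JSON state file (`seen.json`) and entries represented as dicts with a `guid` key; both names are illustrative, not from any particular reader:

```python
# Sketch: remember which article GUIDs were already archived, so each fetch
# only produces files for genuinely new items. "seen.json" is an assumption.
import json
from pathlib import Path

def filter_new(entries: list[dict], seen_path: Path) -> list[dict]:
    """Return only entries whose 'guid' has not been seen before,
    and record them in seen_path for the next run."""
    seen = set(json.loads(seen_path.read_text())) if seen_path.exists() else set()
    fresh = [e for e in entries if e["guid"] not in seen]
    seen.update(e["guid"] for e in fresh)
    seen_path.write_text(json.dumps(sorted(seen)))
    return fresh

entries = [{"guid": "a", "title": "Old"}, {"guid": "b", "title": "New"}]
state = Path("seen.json")
state.write_text(json.dumps(["a"]))  # "a" was archived on a previous run
print([e["title"] for e in filter_new(entries, state)])  # → ['New']
```

Feeding only the `fresh` entries into the per-article export step gives exactly the "one new file per new article" behavior described above.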