The quality of search engines has gone down so much for technical questions.

I'm looking for a way to index sites like the Stack Exchange network, Reddit, Quora, and research papers. Would it be possible to do this locally, with metadata?

jcolag@lemmy.sdf.org 6 points 1 year ago* (last edited 1 year ago)

In addition to YaCy and the varieties of Searx (both of which perform better for me than any of the commercial search engines), it's not even out of the question to do this yourself, if you're willing to start with the most recent Common Crawl dump and do some spidering in between releases. I don't recommend it, unless you want to learn for yourself why search engines often give such miserable results, but it's possible.

However, storage is the real issue here. Can you self-host a search engine? Sure, if you want to maintain the storage to back it. That depends on how deep your pockets go...
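
To give a rough sense of what "indexing pages locally with metadata" means at toy scale, here is a minimal sketch of an in-memory inverted index in Python. It is not how YaCy or Searx work internally; the `Page` and `TinyIndex` names, the `source` labels, and the example.com URLs are all made up for illustration, and a real setup would need crawling, ranking, and on-disk storage on top of this.

```python
import re
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    source: str  # hypothetical metadata label, e.g. "stackexchange" or "reddit"
    text: str


class TinyIndex:
    """Toy inverted index: maps each term to the set of pages containing it."""

    def __init__(self):
        self.pages = []                    # doc id -> Page
        self.postings = defaultdict(set)   # term -> set of doc ids

    @staticmethod
    def tokenize(text):
        # Lowercase and split on anything that is not a letter or digit.
        return re.findall(r"[a-z0-9]+", text.lower())

    def add(self, page):
        doc_id = len(self.pages)
        self.pages.append(page)
        for term in self.tokenize(page.text):
            self.postings[term].add(doc_id)
        return doc_id

    def search(self, query, source=None):
        # Return pages containing every query term, optionally filtered by source.
        terms = self.tokenize(query)
        if not terms:
            return []
        ids = set.intersection(*(self.postings.get(t, set()) for t in terms))
        hits = [self.pages[i] for i in sorted(ids)]
        if source is not None:
            hits = [p for p in hits if p.source == source]
        return hits


index = TinyIndex()
index.add(Page("https://example.com/a", "stackexchange", "How to rotate logs with logrotate"))
index.add(Page("https://example.com/b", "reddit", "Rotate logs on a self-hosted server"))

print([p.url for p in index.search("rotate logs")])
print([p.url for p in index.search("rotate logs", source="reddit")])
```

Everything here lives in RAM, which is the storage problem in miniature: the postings grow with every page you add, so anything beyond a handful of sites needs a disk-backed index.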

this post was submitted on 31 Jul 2023
74 points (97.4% liked)
