Any “small-web” search engines? (lemmy.zip)

submitted 2 years ago by dch82@lemmy.zip to c/technology@lemmy.world

I use DuckDuckGo, but I realised these big(ish) search engines give me all the commercialised results. DuckDuckGo has been going downhill for years, though not at the rate Google or Bing have.

I want to have a search engine that gives me all the small blogs and personal sites.

Does something like this exist?

sxan@midwest.social 1 points 2 years ago

I'm designing off the top of my head, but I think you could do it with a DHT, or even just steal some distributed ledger algorithm from a blockchain. Or you could develop a distributed skip tree. But you're right, any sort of distributed query is going to have possibly unacceptable latency. So you might, like Bitcoin, distribute the index itself to participants (which could be large), but federate the indexing operation so that, rather than a dozen different search engine crawlers hitting each web site, you'd have one or two crawlers per site feeding the shared index.
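To make the "one or two crawlers per site" idea concrete, here is a minimal sketch of how participants could agree on who crawls what without a central coordinator. It uses rendezvous (highest-random-weight) hashing, which is my substitution for illustration rather than a full DHT; the node names, the `crawlers_for` helper, and the `replicas=2` default are all hypothetical.

```python
import hashlib


def score(node: str, site: str) -> int:
    """Deterministic pseudo-random weight for a (node, site) pair."""
    digest = hashlib.sha256(f"{node}|{site}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def crawlers_for(site: str, nodes: list[str], replicas: int = 2) -> list[str]:
    """Pick the `replicas` nodes with the highest score for this site.

    Every participant that knows the same node list computes the same
    answer locally, so no coordinator has to hand out assignments.
    """
    ranked = sorted(nodes, key=lambda node: score(node, site), reverse=True)
    return ranked[:replicas]


if __name__ == "__main__":
    # Hypothetical participant names, purely for illustration.
    participants = ["node-a", "node-b", "node-c", "node-d", "node-e"]
    for site in ["example.org", "example.net"]:
        print(site, "->", crawlers_for(site, participants))
```

A nice property of this scheme is that when a participant that was not assigned to a site leaves, that site's assignment does not change, so churn only moves the work that has to move.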

Distributed search engines have existed for over a decade. Several solutions for distributed Lucene clusters exist (Solr, Katta, Elasticsearch, O2), and while they're mostly designed to run in a LAN where the latency between nodes is small, I don't think it's impossible to imagine a fairly low-latency distributed, replicated index where each node has a small subset of peer nodes which, together, encompass the entire index. No two instances have the same set of peer nodes, but the combined index is eventually consistent.
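As a rough illustration of "eventually consistent" here, the sketch below has each node keep a grow-only inverted index (term to set of URLs) and merge with a small peer list by set union. This is a simplification I'm assuming for the example: every node ends up holding the full index, which is easier to show than the partial-replica layout described above. The `Node` class and its methods are hypothetical, not taken from any of the Lucene-based systems mentioned.

```python
from collections import defaultdict


class Node:
    """Toy index node: grow-only inverted index merged by set union."""

    def __init__(self, name):
        self.name = name
        self.peers = []                 # small subset of other nodes
        self.index = defaultdict(set)   # term -> set of URLs

    def add_document(self, url, text):
        # Naive whitespace tokenizer, enough for the illustration.
        for term in text.lower().split():
            self.index[term].add(url)

    def merge_from(self, other):
        # Set union is commutative, associative and idempotent, so the
        # order and repetition of merges does not affect the end state.
        for term, urls in other.index.items():
            self.index[term] |= urls

    def gossip_round(self):
        # Exchange state in both directions with each peer.
        for peer in self.peers:
            self.merge_from(peer)
            peer.merge_from(self)

    def search(self, term):
        return set(self.index.get(term.lower(), set()))


if __name__ == "__main__":
    nodes = [Node(f"n{i}") for i in range(4)]
    # Ring topology: each node only knows its successor.
    for i, node in enumerate(nodes):
        node.peers = [nodes[(i + 1) % len(nodes)]]
    nodes[0].add_document("https://example.org/a", "small web blog")
    nodes[2].add_document("https://example.net/b", "personal blog")
    for _ in range(3):
        for node in nodes:
            node.gossip_round()
    print(sorted(nodes[1].search("blog")))
```

Deletions and re-ranking are left out on purpose; a grow-only set is the easy case, and anything that removes entries would need tombstones or versioning.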

Again, I'm thinking more about federating and distributing the index-building, so that web sites get hammered less by search engines, which constitute 80% of their traffic. Federating and distributing the query mechanism is a harder problem, but there's a lot of existing R&D in this area, and there are technologies that could be borrowed from other domains (the aforementioned DHT and distributed ledger algorithms).

this post was submitted on 19 Aug 2024

228 points (96.0% liked)
