submitted 6 months ago by prl@lemmy.world to c/technology@lemmy.world
[-] FaceDeer@fedia.io 41 points 6 months ago* (last edited 6 months ago)

The only way I can imagine this working is by twisting the definition of "search engine" enough that you can claim there aren't search engines, but really they still exist, just under a different name.

Search engines aren't actually the "problem" OP wants to address here, though. He just doesn't like the specific search engines that exist right now. What he should really be asking is how a search engine could be implemented without the particular flaws that bother him.

[-] elbarto777@lemmy.world 16 points 6 months ago

Plus the web is not the whole internet.

You could stick to Gopher.

Or use other search engines. There are hundreds. Hundreds.

Maybe not as useful as the dominant ones, though.

[-] MHSJenkins@infosec.pub 24 points 6 months ago

I have a very difficult time imagining an internet that is both interoperable and ranking-free. Now, that having been said, we are well outside my area of expertise here so I'd love to hear from folks who know more than me.

[-] boatswain@infosec.pub 5 points 6 months ago

What about just giving transparency to the ranking and letting people control it? Analogous to "sort by new/best/top", but ideally with more knobs to tweak and a bunch of preset options?
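As a rough illustration of the idea (a hypothetical sketch, not any real engine's API): each result exposes its raw signals, and the user picks a preset or tweaks the weights themselves instead of trusting an opaque server-side formula.

```python
# Hypothetical sketch of user-controlled ranking: the engine exposes the raw
# signals behind each result, and the user chooses or edits the weights.
from dataclasses import dataclass

@dataclass
class Result:
    url: str
    freshness: float   # 0..1, newer is higher
    votes: float       # normalized vote score, 0..1
    text_match: float  # query relevance, 0..1

# Preset weight profiles, analogous to "sort by new/best/top".
PRESETS = {
    "new":  {"freshness": 1.0, "votes": 0.0, "text_match": 0.0},
    "best": {"freshness": 0.1, "votes": 0.6, "text_match": 0.3},
    "top":  {"freshness": 0.0, "votes": 1.0, "text_match": 0.0},
}

def rank(results, weights):
    """Score transparently: the user can inspect and change every weight."""
    def score(r):
        return (weights["freshness"] * r.freshness
                + weights["votes"] * r.votes
                + weights["text_match"] * r.text_match)
    return sorted(results, key=score, reverse=True)

results = [
    Result("a.example", freshness=0.9, votes=0.2, text_match=0.5),
    Result("b.example", freshness=0.1, votes=0.9, text_match=0.8),
]
print([r.url for r in rank(results, PRESETS["new"])])  # a.example first
print([r.url for r in rank(results, PRESETS["top"])])  # b.example first
```

The names and signals here are made up; the point is only that the scoring function is visible and user-editable.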

[-] magic_lobster_party@kbin.run 8 points 6 months ago* (last edited 6 months ago)

Then it’s just more easily abused by SEO. “Best” according to who? Votes? Number of views? Page rank? All numbers can be manipulated.

[-] Valmond@lemmy.world 3 points 6 months ago

Yeah, and please only include lightweight pages with short texts, for example.

[-] MHSJenkins@infosec.pub 1 points 6 months ago

That could work, I suppose, but I do wonder how much it would slow everything down.

[-] bazmatazable@reddthat.com 7 points 6 months ago

I had a similar idea: Could search engines be broken up and distributed instead of being just a couple of monoliths?

Reading the HN thread, the short answer is: NO.

Still, it's fun to imagine what it might look like, if only...

I think the OP is looking for an answer to the problem of Google having a monopoly that makes them practically impossible to challenge. The cost to replicate their search service is so astronomical that it's basically impossible to replace them. Would the OP be satisfied if we could make cheaper components that all fit together into a competing but decentralized search service? Breaking down the technical problems is just the first step; the basic concepts for me are:

Crawling -> Indexing -> Storing/host index -> Ranking

All of them are expensive because the internet is massive! If each of these were isolated but still interoperable, we'd get some interesting possibilities: you could have many smaller specialized companies that focus on, say, better ranking algorithms.

  • What if crawling was done by the owners of each website, with the results submitted to an index database of their choice? This flips the model around, so things like robots.txt might become less relevant. Bad actors and spammers, however, would no longer need any SEO tricks to flood a database or mislead as to their actual content; they could just submit whatever they like! These concerns feed into the next step:
  • What if there were standard indexing functions, similar to how there are many standard hash functions? How a site is indexed plays an important role in how ranking will work (or not) later. You could have a handful of popular general-purpose index algorithms that most sites would produce and submit (e.g. keywords, images, podcasts), combined with many more domain-specific indexing algorithms (e.g. product listings, travel data, mapping, research). Also, if the functions were open standards, a browser could run the index function on the current page and compare the result to the submitted index listing. It could warn users that the page they are viewing is probably either spam or misconfigured in some way that makes the index not match what was submitted.
  • What if the stored indexes were hosted in a distributed way similar to DNS? Sharing the database would lower individual costs. Companies with bigger budgets could replicate the database to provide their users with a faster service. Companies with fewer resources would be able to use the publicly available indexes yet still be competitive.
  • Enabling more competition between different ranking methods would hopefully reduce the effectiveness of SEO gaming (or maybe make it worse, as the same content gets repackaged for each and every index/rank combination). Ranking could even happen locally (this would probably not be efficient at all, but the fact that it might be possible is quite a novel thought).
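The "standard index function" and browser-side verification ideas above could look something like this toy sketch (entirely hypothetical, not a real protocol): the site owner runs an open, standard function over their own page and submits the result; anyone can re-run the same function on the live page and flag a mismatch.

```python
# Toy sketch of a "standard index function" and client-side verification.
# Everything here is invented for illustration, not an existing standard.
import hashlib
import re
from collections import Counter

def standard_keyword_index(page_text, top_n=5):
    """One imagined 'standard' index function: top keywords plus a digest."""
    words = re.findall(r"[a-z]{4,}", page_text.lower())
    keywords = [w for w, _ in Counter(words).most_common(top_n)]
    digest = hashlib.sha256(" ".join(keywords).encode()).hexdigest()[:16]
    return {"keywords": keywords, "digest": digest}

def verify_listing(live_page_text, submitted_listing):
    """Browser-side check: does the submitted index match the live page?"""
    return standard_keyword_index(live_page_text) == submitted_listing

page = "gardening tips: gardening soil, gardening tools, soil care, tools care"
listing = standard_keyword_index(page)       # what the site owner submits
print(verify_listing(page, listing))         # True: listing matches the page
print(verify_listing("buy cheap pills now pills pills", listing))  # False
```

A real version would need a far more robust index function, but the shape is the same: because the function is an open standard, a mismatch between page and listing is detectable by any client.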

Sigh... enough daydreaming already.

[-] magic_lobster_party@kbin.run 5 points 6 months ago

Not sure how that can be implemented, but I’m sure it will only lead to great amounts of SEO abuse. It only works if everybody is acting in good faith.

[-] Temperche@slrpnk.net 4 points 6 months ago

Would need human curation to select the best websites in each field.

[-] itsathursday@lemmy.world 5 points 6 months ago

Yahoo back in the day with its categories, and later Fazed.net with its curated links, made for a nice time for a while.

[-] TrumpetX@programming.dev 6 points 6 months ago

Pay to play was the problem there. I had the highest ranking joke page on webcrawler for a stint, but Yahoo wanted $500 to put me on top. My 15 year old self was not interested.

[-] PrinceWith999Enemies@lemmy.world 4 points 6 months ago

That’s pretty much what all of the site aggregators were. I ran a couple of communities on yahoo and some other sites. There were also services like Archie, gopher, and wais, and I am pretty sure my Usenet client had some searching on it (it might have been emacs - I can’t remember anymore). I remember when Google debuted on Stanford.edu/google and realized that everything was about to change.

[-] erwan@lemmy.ml 2 points 6 months ago

It worked because the web was much smaller.

[-] SlopppyEngineer@lemmy.world 1 points 6 months ago

Or AI to rank and filter for the things you need, based on public indexing. Preferably there'd be several AI assistants to choose from. Things seem to be moving in that direction anyway.

[-] sem@lemmy.ml 11 points 6 months ago

The problem is that personalization of search results tends to create information bubbles. That's the reason I prefer DDG over Google.

[-] BearOfaTime@lemm.ee 0 points 6 months ago* (last edited 6 months ago)

While this is true (and a problem with current engines like Google), I could see having a local LLM doing the filtering for you based on your own criteria. Then you could do a wide-open search as needed, or with minimal filtering, etc.

When I'm searching for technical stuff (Android ROMs, Linux commands/how things work), it would be really helpful to have some really capable filtering mechanisms that have learned my preferences.

When I want to find something from a headline, then it needs to be mostly open (well, maybe filtering out The Weekly World News).

But it really needs to be done by my own instance of an LLM/AI, not something controlled elsewhere.
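A bare-bones stand-in for that idea (a hypothetical sketch; a real version might use a local LLM instead of hand-written rules): re-rank and filter whatever results come back, entirely on your own machine, using criteria only you control.

```python
# Hypothetical local re-ranker: the user's own criteria applied client-side,
# instead of filtering controlled by the search provider.
def local_filter(results, boost_domains=(), block_domains=()):
    """Re-rank results by personal rules; everything runs locally."""
    kept = [r for r in results if r["domain"] not in block_domains]
    # Stable sort: boosted domains float to the top, original order otherwise.
    return sorted(kept, key=lambda r: r["domain"] not in boost_domains)

results = [
    {"domain": "weeklyworldnews.example", "title": "Bat Boy returns"},
    {"domain": "kernel.org", "title": "Linux kernel docs"},
    {"domain": "xdaforums.com", "title": "Android ROM guide"},
]
# A "technical search" profile: boost trusted sources, drop tabloids.
filtered = local_filter(results,
                        boost_domains={"kernel.org", "xdaforums.com"},
                        block_domains={"weeklyworldnews.example"})
print([r["domain"] for r in filtered])
```

The domain lists are invented examples; the point is that the rules live on the client, so a wide-open search and a tightly filtered one are just different local profiles.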

[-] _sideffect@lemmy.world 9 points 6 months ago

AI won't help, since it'll be programmed to show only what its owners want us to see.

[-] BearOfaTime@lemm.ee 4 points 6 months ago

With your own customization, done locally.

[-] chiisana@lemmy.chiisana.net 1 points 6 months ago

Given that the indices are not available locally, it’d be difficult for your own algorithm of any sort, AI or otherwise, to rank items higher/lower than others.

[-] readbeanicecream@kbin.earth 2 points 6 months ago

Couldn't metasearch engines like metager help with this?

this post was submitted on 19 May 2024
66 points (81.1% liked)
