Perhaps https://github.com/searxng/searxng ?
It looks like a few people are recommending this, so just a quick note in case people are unaware:
If you want to avoid being tracked, this is not a good solution. SearXNG is a metasearch engine, meaning it is effectively a proxy: you search on SearXNG, it searches multiple sites and sends all the results back to you. If you use a public instance, you may be protected from the actual search engine*, because many people use the same instance and your queries are mixed in with all of theirs. If you self-host, however, all the searches will be your own, so from the upstream engine's point of view there is no difference between using SearXNG and going to the site yourself.
*The caveat with using the public instances is that while you may be protected from the upstream engine, you have to trust the admins: nothing stops them from tracking you themselves (or passing your data on).
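As a rough illustration of the mixing argument, here is a toy Python model of what an upstream engine can log. Everything in it is hypothetical (the `upstream_view` function, the user names, the documentation-range IP addresses); it is not SearXNG code, just a sketch of the idea that the upstream only sees the address a request came from.

```python
from collections import defaultdict


def upstream_view(requests, instance_ip=None):
    """Model what the upstream search engine can log.

    requests: iterable of (user, user_ip, query) tuples. The user name is
        ground truth known only to this simulation, not to the engine.
    instance_ip: if given, every query is relayed through a metasearch
        instance at this address; otherwise users contact the engine directly.

    Returns a dict mapping each source IP the engine sees to the set of
    real users hidden behind that address.
    """
    seen = defaultdict(set)
    for user, user_ip, _query in requests:
        source = instance_ip if instance_ip is not None else user_ip
        seen[source].add(user)
    return dict(seen)


# Hypothetical users and addresses, for illustration only.
crowd = [
    ("alice", "198.51.100.1", "self-hosted search"),
    ("bob", "198.51.100.2", "weather"),
    ("carol", "198.51.100.3", "news"),
]

# Public instance: three users share one source address.
public = upstream_view(crowd, instance_ip="203.0.113.10")

# Self-hosted at home: the instance address is your own address.
self_hosted = upstream_view(crowd[:1], instance_ip="198.51.100.1")

print({ip: len(users) for ip, users in public.items()})
print({ip: len(users) for ip, users in self_hosted.items()})
```

In this model the public instance's address hides three users, while the self-hosted address hides exactly one, which is the same picture the engine gets if that user connects directly.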
Despite the claims in their docs, I would not consider this a privacy tool. If you are just looking for a good search engine, this may work, and it gives you flexibility and power to tune it yourself. But it's probably not going to do anything good for your privacy, above and beyond what you can get from other meta search engines like Startpage and DuckDuckGo, or other "private" search engines like Brave.
OP isn't asking for a secure search engine though, they're asking for one without ads that they can control themselves. Also, while SearXNG and other metasearch engines won't necessarily protect you from data harvesting, they will protect you from tracking cookies and the absolute trash mountain of fake results (imo especially noticeable with Google search).
Google's results got so bad recently I had to turn it off in my SearXNG instance.
Use Yandex.ru, if you are looking for free access to the content in English. https://www.reddit.com/r/Piracy/comments/nd7w7s/lpt_if_you_cant_find_a_torrent_via_google_because/
They are explicitly trying to move away from Google, and are looking for a new option because their current solution is forcing them to turn off ad-blocking. Sounds to me like they are looking for a private option. Plus, given the forum in which we are having the discussion (Lemmy), even if OP is not specifically concerned with privacy, it seems likely other users are.
As for cookies, searxng can't do any more than your browser (possibly with extensions) can do, and relying on your browser here is a much better solution, because it protects you on all sites, rather than just on your chosen search engine.
"Trash mountain" results is a whole separate issue - you can certainly tune the results to your liking. But literally the second sentence of their GitHub headline is touting no tracking or profiling, so it seems worth bringing attention to the limitations, and that's all I'm trying to do here.
I'm not an expert, but one could funnel all web traffic through a VPN if needed, right? That might gain even more obscurity, and it shifts the trust to a company rather than an individual instance admin.
(Whether that's an upgrade in privacy is relative.)
You mean between their instance and the final search engines? Or between them and a public instance of searxng?
In either case, I'm not sure it buys you anything in terms of privacy you wouldn't get by using the VPN and going directly to the search engines.
You're partially right about self-hosting, but it still strips out the user tracking scripts and only provides the pure results, and you can make SearXNG route to Tor.
I noted in another comment that SearXNG can't do anything about the trackers that your browser can't do, and solving this at the browser level is a much better solution, because it protects you everywhere, rather than just on the search engine.
Routing over Tor is similar. Yes, you can route the search from your SearXNG instance to Google (or whatever upstream engine) over Tor, and hide your identity from Google. But then you click a link, and your IP connects to the IP of whatever site the results link to, and your ISP sees that. Knowing where you land can tell your ISP a lot about what you searched for. And the site you connected to knows your IP, so they get even more information - they know every action you took on the site, and everything you viewed. If you want to protect all of that, you should just use Tor on your computer, and protect every connection.
This is the same argument as for using Signal vs WhatsApp: yes, in WhatsApp the conversation may be E2E encrypted, but the metadata about who you're chatting with, for how long, etc. is all still very valuable to Meta.
To reiterate/clarify what I've said elsewhere, I'm not making the case that people shouldn't use SearXNG at all, only that their privacy claims are overstated, and if your goal is privacy, all the levels of security you would apply to SearXNG should be applied at your device level: Use a browser/extension to block trackers, use Tor to protect all your traffic, etc.
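A toy model of the Tor argument, in case it helps. The function below is hypothetical and only encodes the reasoning in this comment as a truth table; it measures nothing. It assumes the self-hosted instance runs on the same machine as the browser, so device-wide Tor also covers the instance's outgoing requests.

```python
def who_learns_what(instance_uses_tor, device_uses_tor):
    """Encode the argument above as a truth table (a simplification).

    instance_uses_tor: only the instance's upstream searches go over Tor.
    device_uses_tor: every connection from the device goes over Tor.
        Assumption: the instance runs on that same device.
    """
    return {
        # The engine sees your address unless the search itself is hidden.
        "search_engine_sees_your_ip": not (instance_uses_tor or device_uses_tor),
        # Clicking a result is an ordinary connection unless the device uses Tor.
        "isp_sees_sites_you_visit": not device_uses_tor,
        "visited_site_sees_your_ip": not device_uses_tor,
    }


for label, flags in [
    ("instance over Tor only", (True, False)),
    ("Tor on the whole device", (False, True)),
]:
    print(label, who_learns_what(*flags))
```

With Tor only on the instance, the search engine is kept in the dark but the ISP and the destination site are not; with Tor on the device, all three entries are false in this model.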
Seconding this. I use it and it works fairly well.
I'm really happy with my searxng instance
I had no idea that was what that was. Learn something new every day.
Like others, I use searxng.
But you can also try Whoogle and LibreX.
I am using searxng
You can customize a lot and the results are good imo 🌞
Now I need to figure out how to make it public/accessible so I can use it when I'm out and about.
Easiest way if it's only for yourself is using Tailscale.
Set up a VPN. Safest / best way to do it
Search engines take a LOT of work to run, which is why there are so few of them. You can self-host a search engine that indexes one site, but not one that indexes the entire internet lol. The closest you'll find is SearXNG, as others mentioned. It's not a search engine itself though; it just uses other search engines.
YaCy is pretty great.
Yes, YaCy is what you want OP (https://yacy.net). It's rather pathetic that people are still trying to be parasites, but wanting to do so anonymously. Roll up your sleeves and commit your resources to making community search engines work. You have the control.
Huh…so there’s currently no open source search engine out there? I see a few crawlers, and some UIs the crawlers can use but no one project consolidating the two.
Instead of a 'normal' search engine, you could take a look at a GPT-like replacement. Maybe there is one that also protects your privacy, and it can certainly be used to find what normal search engines could find.