this post was submitted on 02 Oct 2025
121 points (96.2% liked)
Piracy: ꜱᴀɪʟ ᴛʜᴇ ʜɪɢʜ ꜱᴇᴀꜱ
It's ok: Google and all other ad-supported search is about to go the way of the dinosaur as soon as local AI search catches on. When your own PC runs a search for you, it basically googles on your behalf and you never see those ads.
It's going to change everything.
It's not going to change everything. Why would you ever use an LLM for anything information related ever? I can make up wrong answers just as fast as it can.
I really hope that this is a joke and I'm making a fool of myself.
Google search: "scientific articles about (whatever)" Then you get tons of ads and irrelevant results.
LLM search: "Find me scientific articles about (whatever)" Then you get just the titles and links (with maybe a short summary).
It's 100% better, and you don't have to worry about hallucinations since it wasn't actually trying to find an answer... just helping you perform a search.
You're joking, right? "Making up answers" in the case of search results just means a dead link. If you get a good link 99% of the time and don't have to use an enshittified service, that's good enough for 99% of people. Trying again is the worst-case scenario.
Finding search terms is the one task I consistently use LLMs for. They did not say that, though; they said LLMs would replace traditional search, that traditional search is about to "go the way of the dinosaur". I don't trust any local LLM to accurately recall anything it read.
Not to mention that once we gain dependence on LLMs, which is something big tech is trying really hard to achieve right now, it will not be all that difficult for the creators to introduce biases that give us many of the same problems as search engines. Product placement, political censorship, etc. There would not be billions of dollars in investment if they thought they weren't going to get anything out of it.
(The best) local LLMs are FOSS, though. If bias is introduced, it can be detected and the user base can shift to another version, unlike centralized cloud LLMs, which are private silos.
I also don't think LLMs of any kind will fully replace search engines, but I do think they will be one of a suite of ML tools that will enable running efficient local (or distributed) indexing and search of the web.
First of all, they are not FOSS. I know it seems tangential to the discussion, but it's important because biases cannot be reliably detected without the starting data. You should also not trust humans to see bias because humans themselves are quite biased and will generally assume that the LLM is behaving correctly if it aligns with their biases, which can be shifted in various ways over time, too.
Second, local LLMs don't have the benefit of free software where we can modify them freely or make forks if there are problems. Sure, there's fine tuning, but you don't get full control that way, and you need access to your own tuning data set. We would really just have the option to switch products, which doesn't put us much further ahead than using the closed off products available online.
I'm all for adding them to the arsenal of tools, but they are deceptively difficult to use correctly, which makes it so hard for me to be excited about them. I hardly see anyone using these tools for the purposes they are actually good for, and the things they are good for are also deceptively limited.
Yeah, no thanks. I'll pass.
Someone would have to pay for the API calls though. And that tends to mean either pay a subscription or view ads. There's no technical reason your local LLM couldn't call a search engine's API to give you an ad-free search experience, and in fact you don't need an LLM to run a local ad-free search frontend. But there is a commercial reason, namely that whoever runs the search engine API will want payment. It would be some progress to have an ad-free search subscription, but it wouldn't get around all the megacorp fuckery that decides what search results you get.
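To make the "local ad-free search frontend" idea above concrete, here is a minimal sketch. The endpoint URL, the `key` parameter, and the `results` / `sponsored` field names are all assumptions made up for illustration; no real search provider's API is being described, and a real one would need its own request and response format.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: stands in for whatever paid search API you subscribe to.
API_ENDPOINT = "https://search.example/api/v1/search"


def build_search_url(query, api_key, count=10):
    """Build the request URL a local frontend would fetch on your behalf."""
    params = {"q": query, "count": count, "key": api_key}
    return f"{API_ENDPOINT}?{urlencode(params)}"


def organic_results(payload):
    """Keep only (title, url) pairs, dropping anything flagged as sponsored.

    Assumes a made-up response shape:
    {"results": [{"title": ..., "url": ..., "sponsored": bool}, ...]}
    """
    return [
        (item["title"], item["url"])
        for item in payload.get("results", [])
        if not item.get("sponsored", False)
    ]


# Example with a canned response instead of a live network call.
sample_payload = {
    "results": [
        {"title": "Buy things now", "url": "https://ads.example/x", "sponsored": True},
        {"title": "A survey paper", "url": "https://papers.example/1"},
    ]
}
links = organic_results(sample_payload)
```

Note that the API key is the part someone has to pay for, which is the commercial point made above; the filtering itself needs no LLM at all.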
APIs are the compromise that sites have to make if they don't want the much more resource-heavy scraping methods used.
The most they could do is rate limit IP addresses, and that doesn't work too well when it's individual users who can just request a new IP any time.
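For anyone wondering what per-IP rate limiting looks like and why a fresh IP sidesteps it, here is a rough sketch of a sliding-window limiter. The class name, the limits, and the example addresses (taken from the reserved documentation ranges) are illustrative assumptions, not how any particular site does it.

```python
import time
from collections import defaultdict, deque


class PerIpRateLimiter:
    """Allow at most `max_requests` per `window_seconds` for each IP address."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # ip -> timestamps of recent requests

    def allow(self, ip, now=None):
        # `now` can be injected for testing; otherwise use a monotonic clock.
        now = time.monotonic() if now is None else now
        hits = self._hits[ip]
        # Forget requests that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


limiter = PerIpRateLimiter(max_requests=2, window_seconds=60)
first = limiter.allow("192.0.2.1", now=0)
second = limiter.allow("192.0.2.1", now=1)
blocked = limiter.allow("192.0.2.1", now=2)      # third request inside the window
new_ip = limiter.allow("198.51.100.7", now=2)    # same user, different address
```

The limiter only knows the address, so the same person coming back from a new IP starts with an empty history.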
Not to mention that the scraped indexes can and should be shared. Unfortunately, what OP is seeing may be a move to thwart this type of brute-force scraping, and it might resolve into dynamically assigned domain addresses: the URL of a given object is temporarily assigned and streamed only to the IP address or group of IP addresses that request it within a given timeframe, then rotated out until it is found in search again and assigned a new URL, and so on. This is a frankly stupid use of resources, but it can effectively prevent crowdsourced indexes from proliferating. It can also be used to punish IPs, or even MAC addresses or browser fingerprints, associated with downloading and reuploading videos, which almost certainly have steganographic fingerprinting embedded that ties each copy to whoever it was served to at the time it was downloaded.
Also, you know what would make this all even worse? Laws requiring that people prove their identity in order to consume content or pull videos... just like age verification laws now being passed in several countries. What a coincidence.
I agree with local search, but I prefer more of a traditional algorithm-based search to generative AI. A solution I've seen (that is far more attainable than building your own search engine) is hosting a metasearch engine, which collates results from search engines, based on your own preferences of results. Or perhaps using someone else's established server if their preferences align with yours. Localised (on-device) search will be a gamechanger in many ways, but I believe a meaningful version of that is far off and potentially impractical to implement.
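As a rough illustration of the "collate results based on your own preferences" part, here is one simple way a metasearch merge could work: a weighted reciprocal-rank fusion. This is a swapped-in textbook technique, not a claim about how any existing metasearch engine ranks results; the engine names, the weights, and the constant `k=60` are assumptions for the example.

```python
from collections import defaultdict


def merge_results(results_by_engine, engine_weights, k=60):
    """Merge ranked URL lists from several engines into one ranking.

    Each URL scores weight / (k + rank) per engine that returned it, so URLs
    that rank well on engines you trust more float to the top.
    Engines missing from `engine_weights` get a default weight of 1.0.
    """
    scores = defaultdict(float)
    for engine, urls in results_by_engine.items():
        weight = engine_weights.get(engine, 1.0)
        for rank, url in enumerate(urls, start=1):
            scores[url] += weight / (k + rank)
    # Highest score first; ties broken alphabetically so output is stable.
    return sorted(scores, key=lambda url: (-scores[url], url))


results = {
    "engine_a": ["https://x.example", "https://y.example"],
    "engine_b": ["https://y.example", "https://z.example"],
}
merged = merge_results(results, {"engine_a": 1.0, "engine_b": 1.0})
```

With equal weights, the URL both engines agree on comes first; raising one engine's weight shifts the ordering toward that engine's picks.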