submitted 9 months ago by d3Xt3r@lemmy.nz to c/technology@lemmy.world

One of Google Search's oldest and best-known features, cache links, is being retired. Accessed via the "Cached" button, these links showed a snapshot of a web page as of the last time Google indexed it. However, according to Google, they're no longer needed.

"It was meant for helping people access pages when way back, you often couldn't depend on a page loading," Google's Danny Sullivan wrote. "These days, things have greatly improved. So, it was decided to retire it."

[-] Raiderkev@lemmy.world 112 points 9 months ago

Without getting into too much detail, a cached site saved my ass in a court case. Fuck you Google.

[-] lud@lemm.ee 14 points 9 months ago

It sucks because it's sometimes (though not very often) useful, but it's not like they're under any obligation to support it, or making any money from doing it.

[-] modus@lemmy.world 8 points 9 months ago

Isn't caching how anti-paywall sites like 12ft.io work?

[-] megaman@discuss.tchncs.de 8 points 9 months ago

At least some of these tools change their "user agent" to be whatever google's crawler is.

When you browse in, say, Firefox, one of the headers that Firefox sends to the website is "I am using Firefox", which might affect how the website displays to you, or let the admin know they need Firefox compatibility (or be used to fingerprint you...).

You can just lie on that, though. Some privacy tools will change it to Chrome, since that's the most common.

Or, you say "i am the google web crawler", which they let past the paywall so it can be added to google.
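A minimal sketch of what those tools do: send the request with Googlebot's User-Agent string instead of the browser's. The URL is a placeholder, and this uses Python's standard library rather than any particular tool's actual code.

```python
# Sketch: building a request that claims to be Google's crawler,
# the way some paywall-bypass tools spoof their User-Agent.
import urllib.request

# Googlebot's published User-Agent string.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

def make_spoofed_request(url: str) -> urllib.request.Request:
    """Build a request whose User-Agent header claims to be Googlebot."""
    return urllib.request.Request(url, headers={"User-Agent": GOOGLEBOT_UA})

req = make_spoofed_request("https://example.com/article")
print(req.get_header("User-agent"))
```

A site that only checks the header sees "Googlebot" and may serve the full article; nothing stops a client from lying here, which is the commenters' point.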

[-] sfgifz@lemmy.world 2 points 9 months ago* (last edited 9 months ago)

Or, you say "i am the google web crawler", which they let past the paywall so it can be added to google.

If I'm not wrong, Google has a set range of IP addresses for their crawlers, so not all sites will let you through just because your UA claims to be Googlebot
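This is roughly right: beyond publishing IP ranges, Google's documented verification method is a reverse DNS lookup on the visitor's IP, checking that the hostname ends in googlebot.com or google.com. A sketch of that server-side check (hostnames in the test are illustrative; a full implementation would also do the forward lookup to confirm the name resolves back to the same IP):

```python
# Sketch: verifying a claimed Googlebot by reverse DNS instead of
# trusting the User-Agent string alone.
import socket

# Domains Google documents for its crawlers.
GOOGLE_CRAWLER_SUFFIXES = (".googlebot.com", ".google.com")

def is_google_host(hostname: str) -> bool:
    """Pure check: does a reverse-DNS hostname belong to Google's crawlers?"""
    return hostname.endswith(GOOGLE_CRAWLER_SUFFIXES)

def looks_like_googlebot(ip: str) -> bool:
    """Reverse-resolve the IP and check the hostname (forward check omitted)."""
    try:
        hostname, _, _ = socket.gethostbyaddr(ip)
    except OSError:
        return False
    return is_google_host(hostname)
```

A site doing this check would let a real crawler through while rejecting a browser that merely spoofed its User-Agent.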

[-] lud@lemm.ee 5 points 9 months ago

I dunno, but I suspect that they aren't using Google's cache if that's the case.

My guess is that the site uses its own scraper that acts like a search engine, and because websites want to be seen by search engines, they allow it to see everything. This is just my guess, so it might very well be completely wrong.

[-] Tangent5280@lemmy.world 9 points 9 months ago

Would you be willing to share more? It's fine if you don't want to, I wouldn't either.

[-] Raiderkev@lemmy.world 24 points 9 months ago

No, it was pretty personal, and also a legal matter, so I gotta take the high road.

[-] verity_kindle@sh.itjust.works 3 points 9 months ago

Respect for your discretion.

[-] Flax_vert@feddit.uk 5 points 9 months ago

Need the tea!!!

[-] drislands@lemmy.world 4 points 9 months ago

Was that not something the Wayback Machine could have solved?

[-] icedterminal@lemmy.world 10 points 9 months ago

Depends. Not every site, or all of its pages, will be crawled by the Internet Archive. Many pages are available only because someone submitted them to be archived, whereas Google Search would typically cache a page once it was indexed.

this post was submitted on 02 Feb 2024
675 points (99.1% liked)

Technology

59583 readers
2494 users here now

This is a most excellent place for technology news and articles.


Our Rules


  1. Follow the lemmy.world rules.
  2. Only tech related content.
  3. Be excellent to each another!
  4. Mod approved content bots can post up to 10 articles per day.
  5. Threads asking for personal tech support may be deleted.
  6. Politics threads may be removed.
  7. No memes allowed as posts, OK to post as comments.
  8. Only approved bots from the list below, to ask if your bot can be added please contact us.
  9. Check for duplicates before posting, duplicates may be removed

Approved Bots


founded 1 year ago
MODERATORS