submitted on 13 Apr 2025 by vermaterc@lemmy.ml to c/firefox@lemmy.ml

The latest nightly builds of Firefox 139 include an experimental link preview feature which shows (among other things) an AI-generated summary of what a page is purportedly about before you visit it, saving you time, a click, or the need to ‘hear’ a real human voice.

algernon@lemmy.ml 0 points 3 days ago (last edited 2 days ago)

I wonder if the preview does a pre-fetch which can be identified as such? As in, I wonder if I'd be able to serve garbage to the AI summarizer, but the regular content to normal visitors. Guess I'll have to check!

Update: It looks like it sends an X-Firefox-Ai: 1 header. Cool. I can catch that, and deal with it.
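For illustration, a minimal sketch of the idea as a Go handler (the response text, port, and handler are made up for the example; the same header check can be expressed in any web server or reverse proxy config):

```go
package main

import (
	"fmt"
	"net/http"
)

// Sketch: serve different content when Firefox's link-preview fetch
// announces itself with the X-Firefox-Ai header. Everything except the
// header name is invented for illustration.
func main() {
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("X-Firefox-Ai") == "1" {
			// AI preview fetch: hand it something useless instead of the page.
			fmt.Fprintln(w, "Nothing to summarize here.")
			return
		}
		// Normal visitor: serve the real content.
		fmt.Fprintln(w, "<html><body>The actual page.</body></html>")
	})
	http.ListenAndServe(":8080", nil)
}
```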

ReversalHatchery@beehaw.org 1 point 2 days ago

User agents that /look/ legit, but it is highly unlikely that anyone is still using them /today/.

I hope I'll never need to deal with websites administered by you. This is way overboard.
Unfortunately, the chances of that are not as small as I would like.

algernon@lemmy.ml 1 point 2 days ago

Overboard? Because I disallow AI summaries?

Or are you referring to my "try to detect sketchy user agents" ruleset? Because that had two false positives in the past two months, yet those rules are responsible for stopping about 2.5 million requests per day, none of which were from a human (I'd know; human visitors have very different access patterns, even when they visit the maze).

If the bots were behaving correctly, and respected my robots.txt, I wouldn't need to fight them. But when they're DDoSing my sites from literally thousands of IPs, generating millions of requests a day, I will go to extreme lengths to make them go away.

ReversalHatchery@beehaw.org 0 points 2 days ago

Overboard? Because I disallow AI summaries?

you disallow access to your website, including when the user agent is a little unusual. do you also only allow the latest major version of the official builds of the major browsers?

yet, those rules are responsible for stopping about 2.5 million requests per day,

nepenthes. make them regret it

algernon@lemmy.ml 1 point 2 days ago

you disallow access to your website

I do. Any legit visitor is free to roam around. I keep the baddies away, as if I were using a firewall. You do use a firewall, right?

when the user agent is a little unusual

Nope. I disallow them when the user agent is very obviously fake. No one in 2025 is going to browse the web with "Firefox 3.8pre5", or "Mozilla/4.0", or a decade-old Opera, or Microsoft Internet Explorer 5.0. None of those would be able to connect anyway, because they do not support the modern TLS ciphers required. The rest are similarly unrealistic.
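For illustration, a minimal sketch in Go of what such a user-agent filter could look like (the patterns below are just the examples from this comment, not the actual ruleset, and the responses are invented):

```go
package main

import (
	"net/http"
	"regexp"
)

// Sketch of a "sketchy user agent" filter: reject UAs that no real
// browser sends in 2025. Patterns are illustrative only.
var fakeUA = regexp.MustCompile(`Firefox/3\.8pre5|^Mozilla/4\.0$|Opera/9\.|MSIE 5\.0`)

func blockFakeAgents(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if fakeUA.MatchString(r.UserAgent()) {
			// Obviously fake agent: refuse (or redirect it into the maze).
			http.Error(w, "go away", http.StatusForbidden)
			return
		}
		next.ServeHTTP(w, r)
	})
}

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("real content"))
	})
	http.ListenAndServe(":8080", blockFakeAgents(mux))
}
```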

nepenthes. make them regret it

What do you think happens when a bad agent is caught by my rules? They end up in an infinite maze of garbage, much like the one generated by nepenthes. I use my own generator (iocaine), for reasons, but it is very similar to nepenthes. But... I'm puzzled now. Just a few lines above, you argued that I am disallowing access to my website, and now you're telling me to use an infinite maze of garbage to serve them instead?

That is precisely what I am doing.

By the way, nepenthes/iocaine/etc. alone does not do jack shit against these sketchy agents. I can guide them into the maze, but as long as they can access content outside of it, they'll keep bombarding my backend, and will keep training on my work. There are two ways to stop them: passive identification, like my sketchy agents ruleset, or proof-of-work solutions like Anubis. Anubis has the huge downside that it is very disruptive to legit visitors. So I'm choosing the lesser evil.
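For the curious, a toy sketch in Go of what a nepenthes/iocaine-style maze endpoint might look like (the wordlist, paths, and layout are invented; the real generators are far more elaborate):

```go
package main

import (
	"fmt"
	"hash/fnv"
	"math/rand"
	"net/http"
)

// Sketch of a garbage maze: every URL under /maze/ returns generated filler
// text plus links to more maze pages, so a crawler that ignores robots.txt
// never runs out of pages to fetch.
var words = []string{"lorem", "ipsum", "dolor", "sit", "amet", "consectetur"}

func mazePage(w http.ResponseWriter, r *http.Request) {
	// Seed from the path so the same URL always yields the same garbage.
	h := fnv.New64a()
	h.Write([]byte(r.URL.Path))
	rng := rand.New(rand.NewSource(int64(h.Sum64())))

	fmt.Fprintln(w, "<html><body>")
	for i := 0; i < 200; i++ {
		fmt.Fprint(w, words[rng.Intn(len(words))], " ")
	}
	// Links deeper into the maze.
	for i := 0; i < 5; i++ {
		fmt.Fprintf(w, `<p><a href="/maze/%d">more</a></p>`, rng.Int63())
	}
	fmt.Fprintln(w, "</body></html>")
}

func main() {
	http.HandleFunc("/maze/", mazePage)
	http.ListenAndServe(":8080", nil)
}
```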
