Mozilla Foundation Calls on Tech Industry to Block ICE Contractor
(www.404media.co)
How exactly does a website stop a web scraper from one specific org?
I mean isn't that the whole point of web scraping? That if it's publicly available, anybody, including people like ICE, will find a way to get the data?
Yeah, it's not technically impossible to stop web scrapers, but it's hard to find a lasting, effective solution. One easy way is to block the scraper's user-agent, assuming it uses an identifiable one, but that's easily circumvented. Another easy and somewhat more effective way is to block the IP addresses of scrapers and caching services, but that turns into a game of whack-a-mole. You could also put content behind a paywall or login and refuse to approve a certain org, but that only works for certain use cases and is also easy to circumvent. If stopping a single org's scraping is the hill you want to die on, good luck.
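For anyone curious what the user-agent and IP blocking looks like in practice, here's a rough sketch in Python. Everything in it is made up for illustration: `is_blocked`, the `examplescraperbot` string, and the networks (which are reserved documentation ranges, not real scraper addresses). A real setup would probably do this at the reverse proxy or firewall instead.

```python
import ipaddress

# Hypothetical blocklists: the user-agent substring and the networks below
# are placeholders (RFC 5737 documentation ranges), not real scraper data.
BLOCKED_UA_SUBSTRINGS = ["examplescraperbot"]
BLOCKED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_blocked(user_agent: str, remote_ip: str) -> bool:
    """Return True if the request matches either blocklist."""
    # Check 1: case-insensitive substring match on the User-Agent header.
    ua = user_agent.lower()
    if any(fragment in ua for fragment in BLOCKED_UA_SUBSTRINGS):
        return True

    # Check 2: is the client address inside any blocked network?
    try:
        addr = ipaddress.ip_address(remote_ip)
    except ValueError:
        # Unparseable address: this check cannot match, so do not block.
        return False
    return any(addr in network for network in BLOCKED_NETWORKS)
```

It also shows why both checks are weak: the scraper can send any User-Agent string it likes, and it can move to an address outside the listed ranges, which is where the whack-a-mole comes from.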
That said, I'm all for fighting ICE, even if it's futile. Just slowing them down and frustrating them is useful.