You can do anything at Zombocom
You basically have two options: treat the HTML as a string, or parse it and then process it with higher-level DOM features.
The problem with the second approach is that HTML may look like an XML dialect, but it is actually immensely quirky and tolerant. Moreover, the modern web page is crazy bloated, so mass-processing pages can be surprisingly demanding. And in the end you still need to write custom code to grab the data you're after.
On the other hand, string searching is as lightweight as it gets, and as a scraper you typically don't really need to care about document structure anyway.
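To make the two options concrete, here is a rough sketch in Python using only the standard library. The sample page, the `price` class name, and the function names are all made up for illustration. Also note that `html.parser` is an event-based tokenizer rather than a full DOM builder, so treat the second half as a stand-in for the "parse it" approach, not a complete one.

```python
import re
from html.parser import HTMLParser

# Hypothetical sample page; note the unclosed <p> and <b> tags,
# which real-world HTML tolerates but strict XML would not.
SAMPLE = """<html><body>
<p>Intro<b>bold
<span class="price">19.99</span>
<div><span class="price">5.00</span></div>
</body></html>"""


def prices_by_string_search(html: str) -> list[str]:
    """Option 1: treat the page as plain text and match with a regex."""
    return re.findall(r'<span class="price">([^<]*)</span>', html)


class PriceParser(HTMLParser):
    """Option 2: tokenize the markup and collect text inside span.price."""

    def __init__(self) -> None:
        super().__init__()
        self.prices: list[str] = []
        self._in_price = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) tuples.
        if tag == "span" and ("class", "price") in attrs:
            self._in_price = True

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_price = False

    def handle_data(self, data):
        if self._in_price:
            self.prices.append(data.strip())


def prices_by_parsing(html: str) -> list[str]:
    parser = PriceParser()
    parser.feed(html)
    parser.close()
    return parser.prices


if __name__ == "__main__":
    print(prices_by_string_search(SAMPLE))
    print(prices_by_parsing(SAMPLE))
```

Both functions pull the same values out of this sample, but the string version is a single line of custom code, while the parsing version needs state tracking even for something this simple. The regex does break as soon as the attribute order or quoting changes, so that is the trade-off you accept for the lighter approach.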
That makes a ton of sense. I hadn't thought about the page size yet. Thanks again.