Same, that's why I don't understand how this is supposed to stay a two-pizza team system
"Each team is full-stack and full-lifecycle: responsible for front-end, back-end, database, business analysis, feature prioritization, UX, testing, deployment, monitoring"
"But they also shouldn't be too large, ideally each one is a Two Pizza Team"
Either that's a team with some hugely diversified skills, or that's two car-sized pizzas
Yep, try scrapy. It also handles the concurrency of your pipeline items for you, configuration for every part,...
The huge feature of scrapy is its pipelining system: you scrape a page, pass it to the filtering part, then to the deduplication part, then to the DB and so on
Hugely useful when you're scraping and extracting data; if you're only grabbing raw pages, I reckon it's less useful
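To give a rough idea of the filter -> dedup -> DB chain, here's a minimal sketch. The `process_item(self, item, spider)` method, `DropItem` and the `ITEM_PIPELINES` setting are Scrapy's own; the class names, the `myproject.pipelines` path, the "no title" rule and the `run_by_hand` helper are made up for illustration, and the "DB" is just a list.

```python
try:
    from scrapy.exceptions import DropItem
except ImportError:
    # Fallback so the sketch still runs without Scrapy installed (assumption)
    class DropItem(Exception):
        pass


class FilterPipeline:
    """Drop items that have no title (hypothetical rule)."""

    def process_item(self, item, spider):
        if not item.get("title"):
            raise DropItem("missing title")
        return item


class DedupPipeline:
    """Drop items whose URL was already seen."""

    def __init__(self):
        self.seen = set()

    def process_item(self, item, spider):
        if item["url"] in self.seen:
            raise DropItem("duplicate url")
        self.seen.add(item["url"])
        return item


class StorePipeline:
    """Stand-in for the DB step: just keeps items in a list."""

    def __init__(self):
        self.rows = []

    def process_item(self, item, spider):
        self.rows.append(dict(item))
        return item


# In a real project this dict lives in settings.py; lower numbers run first.
ITEM_PIPELINES = {
    "myproject.pipelines.FilterPipeline": 100,
    "myproject.pipelines.DedupPipeline": 200,
    "myproject.pipelines.StorePipeline": 300,
}


def run_by_hand(items, stages):
    """Hypothetical helper: pushes each item through the stages in order,
    skipping dropped items. Scrapy does this for you, plus the concurrency."""
    kept = []
    for item in items:
        try:
            for stage in stages:
                item = stage.process_item(item, spider=None)
        except DropItem:
            continue
        kept.append(item)
    return kept
```

In an actual spider you never call the stages yourself: you yield items from your parse callback and Scrapy runs them through whatever is listed in `ITEM_PIPELINES`.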
Wats0ns
Isn't that on purpose tho? Like "hey, if we're not sure we can brake in time, just disengage so it's not our responsibility anymore"?