Browsed by
Category: spider

Building spiders made easy: GUI For Your Scrapy Shell

As a python developer at Scrapinghub, I spend a lot of time in the Scrapy shell. This is a command-line interface that comes with Scrapy and allows you to run simple, spider compatible code. It gets the job done, sure, but there’s a point where a command-line interface can become kinda fiddly and I found I was passing that point pretty regularly. I have some background in tool design and task...

Meet Spidermon: Scrapinghub’s Battle Tested Spider Monitoring Library [Now Open Sourced]

Your spider is developed and we are getting our structured data daily, so our job is done, right?

Absolutely not! Website changes (sometimes very subtly), anti-bot countermeasures and temporary problems often reduce the quality and reliability of our data.