Dirbot - a new example Scrapy project

Scrapy users have complained in the past about the lack of a pre-built example project, such as one containing the dmoz spider described in the tutorial.

Complain no more! We're happy to let you know that there is now a functional Scrapy project available on GitHub, which contains the old Google Directory spider and the dmoz spider described in the tutorial.

The project is called "dirbot", and it's available at https://github.com/scrapy/dirbot

The documentation of Scrapy 0.13 (which will become the next stable release, Scrapy 0.14) has been updated to point to this new example project.
