Dirbot - a new example Scrapy project

Scrapy users have complained in the past about the lack of a pre-built example project that contains, for example, the dmoz spider described in the tutorial.

Complain no more! We're happy to let you know that there is now a functional Scrapy project available on GitHub, which contains the old Google Directory spider and the Dmoz spider described in the tutorial.

The project is called "dirbot", and it's available at https://github.com/scrapy/dirbot

The documentation of Scrapy 0.13 (which will become the next stable release, Scrapy 0.14) has been updated to point to this new example project.
