How to use Crawlera with Scrapy

Crawlera is a proxy service designed specifically for web scraping. In this article, you will learn how to use Crawlera inside your Scrapy spider.

How Crawlera works

Crawlera is a smart HTTP/HTTPS downloader. When you make requests through Crawlera, it routes them through a pool of IP addresses. When necessary, it automatically introduces delays between requests and discards IP addresses to avoid anti-crawling measures. As a result, your requests succeed without you having to manage proxies yourself.
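To make the idea of rotating, discarding, and slowing down more concrete, here is a toy sketch. It is not Crawlera's actual logic, which this article doesn't describe; the class name `ToyProxyPool`, its methods, and the example addresses are all made up for illustration.

```python
from collections import deque


class ToyProxyPool:
    """Toy model of a rotating proxy pool (hypothetical, not Crawlera's code)."""

    def __init__(self, addresses, delay_seconds=0.0):
        self._addresses = deque(addresses)
        self.delay_seconds = delay_seconds

    def next_address(self):
        """Return the next address in round-robin order."""
        if not self._addresses:
            raise RuntimeError("no addresses left in the pool")
        address = self._addresses[0]
        self._addresses.rotate(-1)  # move the used address to the back
        return address

    def discard(self, address):
        """Drop an address that the target site appears to have blocked."""
        if address in self._addresses:
            self._addresses.remove(address)

    def slow_down(self, factor=2.0, minimum=1.0):
        """Increase the pause between requests after a sign of throttling."""
        self.delay_seconds = max(minimum, self.delay_seconds * factor)


pool = ToyProxyPool(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
first = pool.next_address()   # first address in the pool
pool.discard("10.0.0.2")      # pretend this address got banned
second = pool.next_address()  # the banned address is skipped
```

With Crawlera you don't write any of this yourself; the service handles it behind a single endpoint.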

Crawlera with Scrapy

To use Crawlera, you need an account with a Crawlera subscription. If you haven’t signed up yet, you can sign up here; it’s free. When you subscribe to a plan, you will get an API key. You will need this API key in your Scrapy project to use Crawlera.

Install Crawlera middleware

The first thing you need to do is install the Crawlera middleware:

pip install scrapy-crawlera

Scrapy settings

Next, add these lines to the project settings:

# enable the middleware
DOWNLOADER_MIDDLEWARES = {'scrapy_crawlera.CrawleraMiddleware': 610}

# enable Crawlera
CRAWLERA_ENABLED = True

# the API key you get with your subscription
CRAWLERA_APIKEY = '<your_crawlera_apikey>'
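If you'd rather not commit the key to version control, one option is to read it from the environment in your settings file. This is only a sketch: the environment variable name `CRAWLERA_APIKEY` is an assumption chosen for this example, not something the middleware requires.

```python
import os

# Hypothetical environment variable name; use whatever fits your deployment.
CRAWLERA_APIKEY = os.environ.get("CRAWLERA_APIKEY", "")

# Only switch the middleware on when a key is actually present.
CRAWLERA_ENABLED = bool(CRAWLERA_APIKEY)
```

With this variant, a missing variable leaves Crawlera disabled instead of sending requests with an empty key.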

Crawlera settings

The middleware adds Crawlera-specific settings to your project, which you can configure in the Scrapy settings. It’s also recommended to change a few of Scrapy’s own settings when using Crawlera:

  • disable the Auto Throttle addon
  • increase the maximum number of concurrent requests
  • increase the download timeout

AUTOTHROTTLE_ENABLED = False
CONCURRENT_REQUESTS = 32
CONCURRENT_REQUESTS_PER_DOMAIN = 32
DOWNLOAD_TIMEOUT = 600
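If only some spiders in a project should go through Crawlera, the same values can be collected in one dict instead of the project-wide settings file. The helper name `build_crawlera_settings` below is hypothetical; the keys and values are the ones shown in this article.

```python
def build_crawlera_settings(apikey):
    """Return the settings discussed in this article as a single dict."""
    return {
        # enable the middleware and Crawlera itself
        "DOWNLOADER_MIDDLEWARES": {"scrapy_crawlera.CrawleraMiddleware": 610},
        "CRAWLERA_ENABLED": True,
        "CRAWLERA_APIKEY": apikey,
        # recommended Scrapy overrides when using Crawlera
        "AUTOTHROTTLE_ENABLED": False,
        "CONCURRENT_REQUESTS": 32,
        "CONCURRENT_REQUESTS_PER_DOMAIN": 32,
        "DOWNLOAD_TIMEOUT": 600,
    }


settings = build_crawlera_settings("<your_crawlera_apikey>")
```

A spider class could then assign the result to Scrapy's `custom_settings` attribute, for example `custom_settings = build_crawlera_settings('<your_crawlera_apikey>')`. Keep in mind that values set this way take precedence over the project settings for that spider.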

If you want to learn more about the usage of Crawlera, go to the support page or check out the FAQ.
