Introducing ScrapyRT: An API for Scrapy spiders

We’re proud to announce our new open source project, ScrapyRT! ScrapyRT, short for Scrapy Real Time, allows you to extract data from a single web page via an API using your existing Scrapy spiders.

Why did we start this project?

We needed to be able to retrieve the latest data for a previously scraped page, on demand. ScrapyRT made this easy by allowing us to reuse our spider logic to extract data from a single page, rather than running the whole crawl again.

How does ScrapyRT work?

ScrapyRT runs as a web service. To retrieve data, you make a request that specifies the URL you want to extract data from and the name of the spider you would like to use.

Let’s say you were running ScrapyRT on localhost. You could make a request like this:

http://localhost:9080/crawl.json?spider_name=foo&url=http://example.com/product/1
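If the page you want to extract has a query string of its own, its `?` and `&` characters need percent-encoding so they are not read as part of the ScrapyRT request. Here is a minimal Python sketch of building the request URL, assuming the same localhost endpoint as the example. `build_crawl_url` is a hypothetical helper name for illustration, not something ScrapyRT provides.

```python
from urllib.parse import urlencode

# Assumed endpoint, copied from the localhost example in this post.
SCRAPYRT_ENDPOINT = "http://localhost:9080/crawl.json"


def build_crawl_url(spider_name, url, endpoint=SCRAPYRT_ENDPOINT):
    """Return a crawl.json request URL (hypothetical helper, not part of ScrapyRT)."""
    # urlencode percent-encodes the target URL, so any '?' or '&' it
    # contains stays inside the 'url' parameter.
    query = urlencode({"spider_name": spider_name, "url": url})
    return f"{endpoint}?{query}"
```

Calling `build_crawl_url("foo", "http://example.com/product/1")` gives the same request as above, with the target URL percent-encoded.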

ScrapyRT will schedule a request in Scrapy for the URL specified and use the ‘foo’ spider’s parse method as a callback. The data extracted from the page will be serialized into JSON and returned in the response body. If the spider specified doesn’t exist, a 404 will be returned. Most Scrapy spiders will work without any additional programming.
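On the client side, that behaviour can be handled roughly as follows. This is a sketch using only the Python standard library, not an official client. The function name `fetch_crawl_result`, the 30 second timeout, and the `opener` parameter are assumptions made for illustration. The sketch makes no assumption about the layout of the JSON body and simply returns whatever was decoded.

```python
import json
from urllib.error import HTTPError
from urllib.request import urlopen


def fetch_crawl_result(request_url, timeout=30, opener=urlopen):
    """Call a ScrapyRT crawl.json URL and return the decoded JSON body.

    Returns None when the service answers 404, which is what ScrapyRT
    sends when the named spider does not exist. `opener` is a hypothetical
    hook that exists only so the function can be exercised offline.
    """
    try:
        with opener(request_url, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except HTTPError as err:
        if err.code == 404:
            return None  # unknown spider
        raise  # any other HTTP error is left for the caller to handle
```

Passing the request URL from the example returns the extracted data as Python objects, or `None` if there is no spider named ‘foo’ in the project.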

How do I use ScrapyRT in my Scrapy project?

 > git clone https://github.com/scrapinghub/scrapyrt.git
 > cd scrapyrt
 > pip install -r requirements.txt
 > python setup.py install
 > cd ~/your-scrapy-project
 > scrapyrt

ScrapyRT will be running on port 9080, and you can schedule your spiders per the example shown earlier.

We hope you find ScrapyRT useful and look forward to hearing your feedback!

Comment here or discuss on HackerNews.
