At Scrapinghub we're always building and running large crawls–last year we had 11 billion requests made on Scrapy Cloud alone. Crawling millions of pages from the internet requires more sophistication than getting a few contacts of a list, as we need to make sure that we get reliable data, up to date lists of item pages and are able to optimise our crawl as much as possible.
Note: Portia is no longer available for new users. It has been disabled for all the new organisations from August 20, 2018 onward.
It’s been several months since we first integrated Portia into our Scrapy Cloud platform, and last week we officially began to phase out Autoscraping in favor of Portia.