It’s the end of an era. Python 2 is on its way out with only a few security and bug fixes forthcoming from now until its official retirement in 2020. Given this withdrawal of support and the fact that Python 3 has snazzier features, we are thrilled to announce that Scrapy Cloud now officially supports Python 3.
If you are new to Scrapinghub, Scrapy Cloud is our production platform that allows you to deploy, monitor, and scale your web scraping projects. It pairs with Scrapy, the open source web scraping framework, and Portia, our open source visual web scraper.
Scrapy + Scrapy Cloud with Python 3
I’m sure you Scrapy users are breathing a huge sigh of relief! While Scrapy has officially supported Python 3 since May, you can now deploy Scrapy spiders that use the fancy new features introduced with Python 3 to Scrapy Cloud. You’ll have the beloved extended tuple unpacking, function annotations, keyword-only arguments and much more at your fingertips.
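To make that concrete, here is a small sketch of those Python 3 features in the kind of helper code a spider might carry; the function and data are hypothetical, not part of Scrapy’s API:

    # Hypothetical helper showing Python 3 features now usable on Scrapy Cloud:
    # function annotations, keyword-only arguments, and extended tuple unpacking.

    def clean_price(raw: str, *, currency: str = "USD") -> float:
        # `currency` is keyword-only; the annotations document expected types.
        return float(raw.strip().lstrip("$"))

    # Extended tuple unpacking: split a scraped row into head and tail at once.
    header, *prices = ["price", "$19.99", "$5.00"]

    total = sum(clean_price(p) for p in prices)
    print(header, round(total, 2))  # price 24.99

None of this was possible in a Python 2 runtime, which is what the new stack unlocks.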
Fear not if you are a Python 2 developer and can’t port your spiders’ codebase to Python 3, because Scrapy Cloud will continue supporting Python 2. In fact, Python 2 remains the default unless you explicitly set your environment to Python 3.
Deploying your Python 3 Spiders
Docker support was one of the new features that came along with the Scrapy Cloud 2.0 release in May. It brings more flexibility to your spiders, allowing you to define the kind of runtime environment (AKA stack) in which they will be executed.
This configuration is done in your local project’s scrapinghub.yml. There you have to include a stacks section, defining scrapy:1.1-py3 as the stack for your Scrapy Cloud project:

projects:
  default: 99999
stacks:
  default: scrapy:1.1-py3
After doing that, you just have to deploy your project using shub:
$ shub deploy
Note: make sure you are using shub 2.3+ by upgrading it:
$ pip install shub --upgrade
And you’re all done! The next time you run your spiders on Scrapy Cloud, they will run on Scrapy 1.1 + Python 3.
Multi-target Deployment File
If you have a multi-target deployment file, you can define a separate stack for each project ID:
projects:
  default:
    id: 55555
    stack: scrapy:1.1
  py3:
    id: 99999
    stack: scrapy:1.1-py3
This allows you to deploy your local project to whichever Scrapy Cloud project you want, using a different stack for each one:
$ shub deploy py3
This deploys your crawler to project 99999 and uses Scrapy 1.1 + Python 3 as the execution environment.
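Because a typo in a multi-target scrapinghub.yml quietly deploys to the wrong project or stack, it can be worth sanity-checking the file before running shub. A minimal sketch using PyYAML (the yaml library is an assumption here, not something shub requires):

    import yaml  # PyYAML; assumed installed, not part of the standard library

    CONFIG = """
    projects:
      default:
        id: 55555
        stack: scrapy:1.1
      py3:
        id: 99999
        stack: scrapy:1.1-py3
    """

    conf = yaml.safe_load(CONFIG)
    # Confirm every target defines both an id and a stack before deploying.
    for target, opts in conf["projects"].items():
        assert {"id", "stack"} <= set(opts), f"{target} is missing id or stack"

    print(conf["projects"]["py3"]["stack"])  # scrapy:1.1-py3

The same check works on the single-target form shown earlier; only the shape of the projects mapping differs.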
You can find different versions of the Scrapy stack here.
We hope that you’re as excited as we are for this newest upgrade to Python 3. If you have further questions or are interested in learning more about the souped up Scrapy Cloud, take a look at our Knowledge Base article.
For those new to our platform, Scrapy Cloud has a forever free subscription, so sign up and give us a try.