Browsed by
Category: open-source-2

Building spiders made easy: GUI For Your Scrapy Shell

As a python developer at Scrapinghub, I spend a lot of time in the Scrapy shell. This is a command-line interface that comes with Scrapy and allows you to run simple, spider compatible code. It gets the job done, sure, but there’s a point where a command-line interface can become kinda fiddly and I found I was passing that point pretty regularly. I have some background in tool design and task...

What I Learned as a Google Summer of Code student at Scrapinghub

Google Summer of Code (GSoC) was such a great experience for students like me. I learned so much about open source communities as well as contributing to their complex projects. I also learned a great deal from my mentors, Konstantin and Cathal, about programming and software engineering practices. In my opinion, the most valuable lesson I got from GSoC was what it was like to be a Software...

Do Androids Dream of Electric Sheep?

It got very easy to do Machine Learning: you install a ML library like scikit-learn or xgboost, choose an estimator, feed it some training data, and get a model which can be used for predictions.

Improved Frontera: Web Crawling at Scale with Python 3 Support

Python is our go-to language of choice and Python 2 is losing traction. In order to survive, older programs need to be Python 3 compatible.

This Month in Open Source at Scrapinghub August 2016

Welcome to This Month in Open Source at Scrapinghub! In this regular column, we share all the latest updates on our open source projects including Scrapy, Splash, Portia, and Frontera.

This Month in Open Source at Scrapinghub June 2016

Welcome to This Month in Open Source at Scrapinghub! In this regular column, we share all the latest updates on our open source projects including Scrapy, Splash, Portia, and Frontera.

This Month in Open Source at Scrapinghub March 2016

Welcome to This Month in Open Source at Scrapinghub! In this monthly column, we share all the latest updates on our open source projects including Scrapy, Splash, Portia, and Frontera.

Portia: The Open Source Alternative to Kimono Labs

Note: Portia is no longer available for new users. It has been disabled for all the new organisations from August 20, 2018 onward.

Scrapy on the Road to Python 3 Support

Scrapy is one of the few popular Python packages (almost 10k github stars) that's not yet compatible with Python 3. The team and community around it are working to make it compatible as soon as possible. Here's an overview of what has been happening so far.

The Road to Loading JavaScript in Portia

Note: Portia is no longer available for new users. It has been disabled for all the new organisations from August 20, 2018 onward.

Support for JavaScript has been a much requested feature ever since Portia’s first release 2 years ago. The wait is nearly over and we are happy to inform you that we will be launching these changes in the very near future. If you’re feeling adventurous you can try it...