Transitioning to Remote Working as a Company

I’d like to echo Joel Gasgoine’s sentiments: This is not normal remote working!

Like Buffer, we’ve been a remote-first company for almost 10 years and we’re also adjusting to the new normal as a result of COVID-19.

A Practical Guide to Web Data QA Part I: Validation Techniques

When it comes to web scraping at scale, there’s a set of challenges you need to overcome to extract the data. But once you are able to get it, you still have work to do. You need to have a data QA process in place. Data quality becomes especially crucial if you’re extracting high volumes of data from the web regularly and your team’s success depends on the quality of the scraped data.

COVID-19: Handling the Situation as a Fully Remote Company

Scrapinghub is a fully distributed organization with a remote workforce spread across the globe. This structure will enable us to continue to operate at full capacity during the Coronavirus pandemic and deliver full service to our customers.

Extracting clean article HTML with News API

The Internet offers a vast amount of written content in the form of articles, news, blog posts, stories, essays, tutorials that can be leveraged by many useful applications:

Job Postings Beta API: Extract Job Postings at Scale

We’re excited to announce our newest data extraction API, Job Postings API. From now on, you can use AutoExtract to extract Job Postings data from many job boards and recruitment sites. Without writing any custom data extraction code!

Building spiders made easy: GUI For Your Scrapy Shell

As a python developer at Scrapinghub, I spend a lot of time in the Scrapy shell. This is a command-line interface that comes with Scrapy and allows you to run simple, spider compatible code. It gets the job done, sure, but there’s a point where a command-line interface can become kinda fiddly and I found I was passing that point pretty regularly. I have some background in tool design and task...

How To Scrape The Web Without Getting Blocked

Web scraping is when you extract data from the web and put it in a structured format. Getting structured data from publicly available websites and pages should not be an issue as everybody with an internet connection can access these websites. You should be able to structure it as well. In reality though, it’s not that easy.

News Web Data Extraction to Predict Irish Election Results

Can pre-election news coverage of political parties predict the trend of the elections?

On February 9th, 2020, Ireland elected a new parliament. Prior to the elections, the political parties invested a lot of time, money and energy to get their political message to the people. A lot of research goes into selecting the right platform and the right medium.

Introducing Crawlera free trials & new plans

One of the biggest pain points we’ve heard from our Crawlera customers last year is the inconvenience of having to jump from one Crawlera plan to another, when more requests are needed in a month. For this reason, we have been working on rethinking our Crawlera plans to better accommodate these cases and be more flexible with customers that have variable crawling requirements from month to...

Best Recruitment Tips: How to Scout Top Talent

Attracting top talent is essential for the success and growth of a company. The majority of employers will agree that finding the best talent is just as hard as it is important. Which is why, rather than waiting for the right candidate to magically fall into your lap, it's time for employers to turn towards the untapped power of web scraped recruitment data.