Scrapy Cloud Secrets: Hub Crawl Frontier and How To Use It

Imagine a long crawling process, like extracting data from a website for a whole month. We can start it and leave it running until we get the results. Still, a whole month is plenty of time for something to go wrong. The target website can go down for a few minutes or hours, there can be a power outage at your crawling server, or even some other internet connection...

Blog Comments API (BETA): Extract Blog Comment Data At Scale

A reliable and scalable way to tap into blog comment-driven insights

We are excited to announce our newest data extraction API. The Blog Comments API is now publicly available as a BETA release.

Your Price Intelligence Questions Answered

What is Price Intelligence?

Price Intelligence is leveraging web data to make better pricing, marketing, and business decisions. Basically, it is all about making use of the available data to optimize your pricing strategy, making it more competitive, increasing profitability, and ultimately, improving your business performance.

Data Center Proxies vs. Residential Proxies

In this blog post, you will learn the main difference between data center proxies and residential proxies, and when to use each in your web data extraction project to maximize successful requests.

How to Get High Success Rates With Proxies: 3 Steps to Scale Up

In this article, we give you some insight into how you can scale up your web data extraction project. You will learn the basic elements of scaling up and the steps you should take when looking for the best rotating proxy solution.

Job Postings API: Stable release

Hassle-Free, Structured, Machine-Readable Job Postings Data

We are excited to announce our newest data extraction API. The Job Postings API is now out of BETA and publicly available as a stable release. 

Web Scraping Basics: A Developer’s Guide To Reliably Extract Data

The web is complex and constantly changing. This is one of the reasons why web data extraction can be difficult, especially in the long term. You need to understand really well how a website works before you try to extract data. Luckily, there are lots of inspection and code tools available for this, and in this article we will show you some of our favorites.

Extracting Article & News Data: The Importance of Data Quality

Article and news data extraction is becoming increasingly popular and widely used by companies. Data quality plays a vital role in making sure these projects succeed. If the quality of the extracted articles is not good enough, your whole business could be at risk, especially if it depends on a constant flow of high-quality article data.

Price Gouging or Economics at Work: Price Intelligence to Track Consumer Sentiment

As the COVID-19 pandemic took hold, we at Scrapinghub began to wonder how it would affect the data we crawl, and whether that data could tell us something useful about the pandemic and its impact.

A Practical Guide to Web Data QA Part III: Holistic Data Validation Techniques

In case you missed them, here’s the first part and second part of the series.