Browsed by
Author: Attila Tóth

News & Article Data Extraction: Open Source vs Closed Source Solutions

Article extraction is the process of extracting data fields from an article page and putting it into a machine-readable structured format like JSON. In many use cases, the article page that you want to extract is a news page but it can be any other type of article. Based on our experience in the web data extraction industry for over 10 years, the demand for structured article data is getting...

Real Estate: Use Web Data Extraction to Make Smarter Decisions

As the internet continues to grow, the amount of data it generates grows with it, opening new opportunities to improve processes and make more informed decisions. Real estate is one of the many industries that are being disrupted by data-related technologies and innovations. Whether you are a broker, realtor, investor, or property manager you have the potential to become data-driven and gain...

Data Center Proxies vs. Residential Proxies

In this blog post you are going to learn what’s the main difference between data center proxies and residential proxies. When to use data center and residential proxies in your web data extraction project to maximize successful requests

How to Get High Success Rates With Proxies: 3 Steps to Scale Up

In this article we give you some insight on how you can scale up your web data extraction project. You will learn what are the basic elements of scaling up and what are the steps that you should take when looking for the best rotating proxy solution.

Extracting Article & News Data: The Importance of Data Quality

Article and news data extraction is becoming increasingly popular and widely used by companies. Data quality plays a vital role in making sure these projects succeed. If the quality of the extracted articles is not good enough, your whole business could be at risk, especially if it depends on the constant flow of high quality article data.

Product Reviews API (Beta): Extract Product Reviews at Scale

We are excited to announce our next AutoExtract API: Product Reviews API (Beta). Using this API, you can get access to product reviews in a structured format, without writing site-specific code. You can use the Product Reviews API to extract product reviews from eCommerce sites at scale. Just make a request to the API and receive your data in real-time!

Vehicle API (Beta): Extract Automotive Data at Scale

Today we are delighted to launch a Beta of our newest data extraction API: AutoExtract Vehicle API. With this API you can collect structured data from web pages that contain automotive data such as classified or dealership sites. Using our API, you can get your data without writing site-specific code. If you need automotive/vehicle data, sign up now for a beta version of our Vehicle API.

Job Postings Beta API: Extract Job Postings at Scale

We’re excited to announce our newest data extraction API, Job Postings API. From now on, you can use AutoExtract to extract Job Postings data from many job boards and recruitment sites. Without writing any custom data extraction code!

How To Scrape The Web Without Getting Blocked

Web scraping is when you extract data from the web and put it in a structured format. Getting structured data from publicly available websites and pages should not be an issue as everybody with an internet connection can access these websites. You should be able to structure it as well. In reality though, it’s not that easy.

Try Crawlera For Free

Looking Back at 2019

2019 was an exciting year for Scrapinghub. We created things we have never created before and did things nobody in our industry had ever done before. Let’s revisit what happened in 2019!