Scrapy & AutoExtract API integration

We’ve just released a new open-source Scrapy middleware which makes it easy to integrate AutoExtract into your existing Scrapy spider. If you haven’t heard about AutoExtract yet, it’s an AI-based web scraping tool which automatically extracts data from web pages without the need to write any code. Learn more about AutoExtract here.

Web Scraping Questions & Answers Part I

As you know, we held the first-ever Web Data Extraction Summit last month. During the talks, we had a lot of questions from the audience. We have divided the questions into two parts. In the first part, we will cover questions on Web Scraping at Scale - Proxy and Anti-Ban Best Practice, and on Legal Compliance, GDPR in the World of Web Scraping. Enjoy! You can also check out the full talks on...

Price intelligence with Python: Scrapy, SQL and Pandas

In this article, I will guide you through a web scraping and data visualization project. We will extract data from real e-commerce websites, then try to get some insights out of it. The goal of this article is to show you how to get product pricing data from the web and some ways to analyze it. We will also look at how price intelligence makes a real difference for...

The Web Data Extraction Summit 2019

The Web Data Extraction Summit was held last week, on 17th September, in Dublin, Ireland. This was the first-ever event dedicated to web scraping and data extraction. We had over 140 curious attendees, 16 great speakers, 12 amazing presentations ranging from technical deep dives to business use cases, a customer panel discussion and unlimited Guinness.

News Data Extraction at Scale with AI powered AutoExtract

A huge portion of the internet is news. It’s a very important type of content because there are always things happening, either in our local area or globally, that we want to know about. The amount of news published every day on different sites is ridiculous. Sometimes it’s good news and sometimes it’s bad news, but one thing’s for sure: it’s humanly impossible to read all of it every day.

Gain a Competitive Edge with Product Data

Product data, whether from e-commerce sites, auto listings, or product reviews, offers a treasure trove of insights that can give your business an immense competitive edge in your market. Getting access to this data in a structured format can unleash new potential not only for business intelligence teams, but also for their counterparts in marketing, sales, and management who rely on accurate...

Four Use Cases for Online Public Sentiment Data

The manual method of discovery for gauging online public sentiment towards a product, company, or industry is cursory at best, and at worst, may harm your business by providing incorrect or misleading insights.

The First-Ever Web Data Extraction Summit!

The range of use cases for web data extraction is rapidly increasing, and with it the necessary investment. In addition, the number of websites continues to grow rapidly and is expected to exceed 2 billion by 2020.

How to use proxies with Python Requests module

Sending HTTP requests in Python is not necessarily easy. We have built-in modules like urllib (and urllib2 in Python 2) to deal with HTTP requests. We also have third-party tools like Requests. Many developers use Requests because it is high-level and designed to make it extremely easy to send HTTP requests.
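
As a rough illustration of the topic in this post's title, here is a minimal sketch of routing Requests traffic through a proxy. The proxy host, port, and the helper names build_proxies and make_session are placeholders made up for this example, not taken from the article. It relies only on the documented proxies mapping of a requests.Session.

```python
import requests


def build_proxies(host, port, user=None, password=None):
    """Build the scheme-to-proxy-URL mapping that Requests expects.

    Hypothetical helper: host, port and credentials are placeholders.
    """
    auth = f"{user}:{password}@" if user and password else ""
    proxy_url = f"http://{auth}{host}:{port}"
    # Use the same proxy for both plain HTTP and HTTPS targets.
    return {"http": proxy_url, "https": proxy_url}


def make_session(proxies):
    """Return a Session whose requests go through the given proxies."""
    session = requests.Session()
    session.proxies.update(proxies)
    return session


# Placeholder proxy address; replace with a proxy you actually control.
proxies = build_proxies("proxy.example.com", 8000)
session = make_session(proxies)

# A real call would look like this (commented out: it needs a live proxy):
# response = session.get("https://example.com", timeout=10)
```

For a one-off request, the same mapping can be passed directly, as in requests.get(url, proxies=proxies), instead of configuring a session.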

How to set up a custom proxy in Scrapy?

When scraping the web at a reasonable scale, you can come across a series of problems and challenges. You may want to access a website from a specific country/region. Or maybe you want to work around anti-bot solutions. Whatever the case, to overcome these obstacles you need to use and manage proxies. In this article, I'm going to cover how to set up a custom proxy inside your Scrapy spider in...

Be the first to know. Gain insights. Make better decisions.

Use web data to do all this and more. We’ve been crawling the web since 2010 and can provide you with web data as a service.

Welcome

Here we blog about all things related to web scraping and web data.

If you want to learn more about how you can use web data in your company, check out our Data as a Service page for inspiration.
