
Extracting clean article HTML with News API

The Internet offers a vast amount of written content in the form of articles, news, blog posts, stories, essays, and tutorials that many useful applications can leverage:

Job Postings Beta API: Extract Job Postings at Scale

We’re excited to announce our newest data extraction API, the Job Postings API. From now on, you can use AutoExtract to extract job postings data from many job boards and recruitment sites without writing any custom data extraction code.

Announcing Portia, the Open Source Visual Web Scraper!

Note: Portia is no longer available for new users. It has been disabled for all new organisations since August 20, 2018.

We’re proud to announce the developer release of Portia, our new open source visual scraping tool based on Scrapy. Check out this video:

Introducing Dash

We're excited to introduce Dash, a major update to our scraping platform.

Spiders activity graphs

Today we are introducing a new feature called Spider activity graphs. These let you quickly visualize how your spiders are working, and they are a very useful way for busy projects to find out which spiders are not working as expected.

Autoscraping casts a wider net

We have recently started letting more users into the private beta for our Autoscraping service. We're receiving a lot of applications following the shutdown of Needlebase, and we're increasing our capacity to accommodate these users.