One common misconception about scraping personal data is that public personal data does not fall under the GDPR. Many businesses assume that because the data has already been made public on another website that it is fair game to scrape. In actuality, GDPR makes no blanket exceptions for public personal data and the same analysis for any other personal data must be conducted prior to scraping...
In the fifth and final post of this solution architecture series, we will share with you how we architect a web scraping solution, all the core components of a well-optimized solution, and the resources required to execute it.
In the fourth post of this solution architecture series, we will share with you our step-by-step process for evaluating the technical feasibility of a web scraping project.
Visual web scraping tools are great. They allow people with little to no technical know-how to extract data from websites with only a couple hours of upskilling, making them great for simple lead generation, market intelligence and competitor monitoring projects. Removing countless hours of manual entry work for sales and marketing teams, researchers, and business intelligence team in the...
In this the third post in our solution architecture series, we will share with you our step-by-step process for conducting a legal review of every web scraping project we work on.
To accurately extract data from a web page, developers usually need to develop custom code for each website. This is manageable and recommended for tens or hundreds of websites and where data quality is of the utmost importance, but if you need to extract data from thousands of sites, or rapidly extract data from sites that are not yet covered by pre-existing code, this is often an...
Today, we’re delighted to announce the launch of the beta program for Scrapinghub’s new AI powered developer data extraction API for automated product and article extraction.
After much development and refinement with alpha users, our team have refined this machine learning technology to the point that data extraction engine is capable of automatically identifying common items on product and...
In this the second post in our solution architecture series, we will share with you our step-by-step process for data extraction requirement gathering.
For many people (especially non-techies), trying to architect a web scraping solution for their needs and estimate the resources required to develop it, can be a tricky process.
Oftentimes, this is their first web scraping project and as a result have little reference experience to draw upon when investigating the feasibility of a data extraction project.
In this series of articles we’re going...
St Patrick’s Day Special: Finding Dublin’s Best Pint of Guinness With Web Scraping
At Scrapinghub we are known for our ability to help companies make mission critical business decisions through the use of web scraped data.
But for anyone who enjoys a freshly poured pint of stout, there is one mission critical question that creates a debate like no other…
“Who serves the best pint of Guinness?”