Browsed by
Author: Ian Kerins

St Patrick’s Day Special: Finding Dublin’s Best Pint of Guinness With Web Scraping

At Scrapinghub we are known for our ability to help companies make mission-critical business decisions through the use of web-scraped data.

But for anyone who enjoys a freshly poured pint of stout, there is one mission-critical question that creates a debate like no other…

“Who serves the best pint of Guinness?”

Spidermon: Scrapinghub’s Secret Sauce To Our Data Quality & Reliability Guarantee

If you know anything about Scrapinghub, you know that we are obsessed with data quality and data reliability.

Outside of building some of the most powerful web scraping tools in the world, we also specialise in helping companies extract the data they need for their mission-critical business requirements. Most notably companies who:

  • Rely on web data to make critical business decisions, or;
  • ...

Proxy Management: Should I Build My Proxy Infrastructure In-House Or Use An Off-The-Shelf Proxy Solution?

Proxy management is the thorn in the side of most web scrapers. Without a robust and fully featured proxy infrastructure, you will face constant reliability issues and spend hours putting out proxy fires.

That is a situation no web scraping professional wants to deal with. We web scrapers are interested in extracting and using web data, not managing proxies.

In this article, we’re going to...

A Sneak Peek Inside Crawlera: The World’s Smartest Web Scraping Proxy Network

“How does Scrapinghub Crawlera work?” is the most common question we get from customers who, after struggling for months (or years) with constant proxy issues, see them disappear completely when they switch to Crawlera.

Today we’re going to give you a behind-the-scenes look at Crawlera so you can see for yourself why it is the world’s smartest web scraping proxy network and the...

Why We Created Crawlera? The World’s Smartest Web Scraping Proxy Network

Let’s face it, managing your proxy pool can be an absolute pain and the biggest bottleneck to the reliability of your web scraping! 

Nothing annoys developers more than crawlers failing because their proxies are continuously being banned.

The Rise of Web Data in Hedge Fund Decision Making & The Importance of Data Quality

Over the past few years, there has been an explosion in the use of alternative data sources in investment decision making in hedge funds, investment banks and private equity firms.

These new data sources, collectively known as “alternative data”, have the potential to give firms a crucial informational edge in the market, enabling them to generate alpha.

The Challenges E-Commerce Retailers Face Managing Their Web Scraping Proxies

These days web scraping is ubiquitous amongst the big e-commerce companies, because data-based decision making helps them remain competitive in such a tight-margin business.

E-commerce companies are increasingly using web data to fuel their competitor research, dynamic pricing and new product research.

For these e-commerce sites, their most important consideration is: the ...

Looking Back at 2018

What a year 2018 has been for Scrapinghub!!

It’s hard to know where to start…

This year has seen tremendous growth at Scrapinghub, setting us up to have a great 2019.

Here are some of the highlights of 2018…

Shubber GetTogether 2018

It’s hard to believe our annual Shubber GetTogether is already over.

Data Quality Assurance for Enterprise Web Scraping

When it comes to web scraping, one key element is often overlooked until it becomes a big problem.

That is data quality.

Getting consistently high-quality data when scraping the web is critical to the success of any web scraping project, particularly when scraping the web at scale or extracting mission-critical data where accuracy is paramount.

Data quality can be the difference between a...