Browsed by
Category: performance

Introducing Dash

We're excited to introduce Dash, a major update to our scraping platform.

Why MongoDB Is a Bad Choice for Storing Our Scraped Data

MongoDB was used early on at Scrapinghub to store scraped data because it's convenient. Scraped data is represented as (possibly nested) records which can be serialized to JSON. The schema is not known ahead of time and may change from one job to the next. We need to support browsing, querying and downloading the stored data. This was very easy to implement using MongoDB (easier than the...