Introducing Data Reviews

One of the most time-consuming parts of building a spider is reviewing the scraped data and making sure it conforms to the requirements and expectations of your client or team. Depending on how well the requirements are written, this review often takes longer than writing the spider code itself. To make the process more efficient, we have introduced the ability to comment on data directly in Dash (the Scrapinghub UI), right next to the data, instead of relying on other channels such as issue trackers, email or chat.




With this new feature you can discuss problems with data right where they appear, without copying and pasting data around, and keep the conversation going with your client or team until the issue is resolved. This reduces the time spent on data QA, making the whole process more productive and rewarding.

So go ahead, start adding comments to your data (you can comment whole items or individual fields) and let the conversation flow around data! You can mark resolved issues by archiving comments, and you will see jobs with unresolved (unarchived) comments directly on the Jobs Dashboard.

Last, but not least, you have the Data Reviews API to insert comments programmatically. This is useful, for example, to report problems in post-processing scripts that analyze the scraped data.
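As a rough illustration of that post-processing use case, here is a minimal Python sketch. It only shows the shape of such a script: it checks scraped items for missing required fields and hands each problem to a sender function. The names `find_issues`, `report_issues` and `send_comment` are made up for this example and are not part of the Data Reviews API; the actual request that inserts a comment is deliberately left out, so check the API documentation for the real endpoint and parameters before wiring it in.

```python
def find_issues(items, required_fields):
    """Return one comment record per missing or empty required field.

    Each record carries the item index, the field name and the comment text,
    mirroring the idea of commenting on an individual field of an item.
    """
    issues = []
    for index, item in enumerate(items):
        for field in required_fields:
            value = item.get(field)
            if value is None or value == "":
                issues.append({
                    "item": index,
                    "field": field,
                    "text": f"Required field '{field}' is missing or empty.",
                })
    return issues


def report_issues(issues, send_comment):
    """Pass each issue to send_comment and return how many were sent.

    send_comment is a hypothetical callable (item_index, field, text) that
    you would implement on top of the Data Reviews API.
    """
    count = 0
    for issue in issues:
        send_comment(issue["item"], issue["field"], issue["text"])
        count += 1
    return count


if __name__ == "__main__":
    scraped = [
        {"name": "Widget", "price": "9.99"},
        {"name": "", "price": "4.50"},
        {"name": "Gadget"},
    ]
    issues = find_issues(scraped, ["name", "price"])
    # Placeholder sender: prints the comment instead of calling the API.
    report_issues(issues, lambda item, field, text: print(item, field, text))
```

Keeping the sender as a plain callable means the validation logic can be tested without network access, and the API call can be swapped in later.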

Happy scraping!

