Data Collection Blog

What is an ETL pipeline

This guide walks you through the Extraction, Transformation, and Loading stages of a typical business ETL pipeline. It includes an eCommerce use case that illustrates how an ETL pipeline can be implemented within a day-to-day digital business workflow.

How microdata marketplaces are leading the charge in terms of data monetization

Learn how Data Collection as a Service companies are leveraging the ‘self-serve’ data model to decrease consumer friction and shorten sales cycles, while making long-term, client-side data supply contracts superfluous.

The ultimate guide to leveraging data collection networks in cyber security  

Learn how code-based prevention mechanisms and real-time compliance are enabling a new level of data network safety for companies involved in cyber security threat intelligence and mitigation.

Puppeteer vs Selenium: Main Differences

This ultimate guide covers the origins of both libraries, their key features and functions, and, most importantly, how to choose the option that is best for your business.

Public web data on the frontlines of climate change

Civil society efforts have come together under The Climate Subak, a UK accelerator that has placed its focus on the frontlines of the climate change crisis, using public web data and data-sharing to define sustainable steps to reach our environmental objectives.

Five steps to unleashing the power of data against climate change

Climate change is neither theoretical nor new. While many organizations are already doing their part to fight it individually, we need to build a culture of collaboration and incorporate new data strategies at both the private and public level if we hope to define what a sustainable future could look like.

What is a headless browser and what is it used for?

Headless browsers can make data collection more efficient because they skip rendering graphic elements and are driven directly through code or ‘command lines’. Adding an element of automation can help increase target site success rates, take care of user-agent rotation, and make collecting cookie databases superfluous.

Data pipeline architecture for businesses explained

Choosing the right data pipeline architecture for your business can help enhance your real-time market capture and aid predictive analytics. A good pipeline structure will also help reduce friction while promoting data compartmentalization and uniformity.