The Web Data Revolution
How Public Data is Changing The Way Companies See, Think and Make Decisions
It’s said that, in the early days of Walmart, Sam Walton would monitor how stores were doing by counting the cars in the parking lot.
Now, imagine if he could count the cars at every Walmart and cross-reference that data with local promotions in order to optimize his advertising?
- Or if he could monitor the online price moves of every competitor by zip code in order to dial in his pricing strategies?
- Or if he could capture the make, model and age of every car to decide the best locations to add auto centers?
Sixty years after Sam opened his first store, the power of collecting web data is making these once-unknowable questions remarkably knowable. It’s making the boardrooms, product labs and marketing departments of many top companies look more like intelligence agencies.
This revolution is partly technical: Vast networks of proxy servers, purpose-built for large-scale public web data collection, have made it possible to collect, structure and interpret data in wildly useful new ways. This is a change on the order of Salesforce transforming customer relationship management or Slack replacing office email.
It’s also philosophical: The art of leadership now depends less on being able to guess the right answers, and more on formulating good questions. Get the questions right and, to an outside observer, your competitive advantage can look downright unfair.
But asking good questions doesn’t always come naturally.
How much time do you spend doing research before you make a big decision? For many of us, the answer is not much. Humans naturally make assumptions to simplify a complex world – spotting patterns, drawing on experiences, sharing tips, and following our guts.
This mode of thinking is often surprisingly accurate, but it also comes with dozens of blind spots. These include recency bias, which means relying too heavily on our most recent memories; confirmation bias, which means unconsciously seeking information that supports what we already believe; and narrative fallacy, which is the human tendency to believe the best story rather than the strongest data.
Capitalizing on the public data revolution calls for a more Zen approach. That means approaching business decisions without preconceptions, even when studying situations we think we know very well.
Clearing away existing assumptions can help you stumble upon new information, like a beginner might. It can reveal connections that are hiding in plain sight – like how the number of cars in the parking lot aligns with the number of dollars hitting the bottom line.
Data lets you see marketplace dynamics across products, pricing, inventory, supply chains and consumer behaviors. It tells you what customers are doing, surfaces critical trends, and can even help you anticipate what your competitors will do next.
Gathering insight from public web data represents a tipping point. Companies that missed the rise of the Internet or the mobile economy paid dearly. Miss the web data revolution, and you could become the next Blockbuster Video.
How fast is this growing? At Bright Data, we process 15 billion requests per day, more than 1.5 times the volume handled by all the world’s search engines combined. That number is steadily increasing.
The answers are out there. Now here are some of the bright questions our clients are asking in order to find them:
An investment analyst asks:
“What is the retail price trend for bleach?”
Micah is an analyst at a Boston hedge fund where there’s a lot of pressure to generate new investment ideas. Browsing his stock screeners, a chemical company caught his eye. The company’s numbers looked strong, but he wanted to understand if they could maintain their pricing power in an inflationary environment.
That’s when he hit on an idea: Since the company was one of the largest vendors to the bleach industry, Micah could track bleach prices in every market that the company supplied. He uncovered a clear trend of rising retail prices, indicating that the company had pricing power and could likely protect its margins.
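A trend check like Micah's can be sketched in a few lines. This is only an illustration: the market names and weekly prices below are made up, and fitting a least-squares slope to each market's price series is our assumption about how one might measure "rising," not a description of his actual method.

```python
from statistics import mean

def price_trend(prices):
    """Return the least-squares slope of prices against observation index."""
    n = len(prices)
    if n < 2:
        raise ValueError("need at least two observations")
    xs = range(n)
    x_bar, y_bar = mean(xs), mean(prices)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, prices))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den

# Hypothetical weekly shelf prices (USD) collected for each market.
observations = {
    "market_a": [3.00, 3.10, 3.20, 3.30],
    "market_b": [2.50, 2.50, 2.60, 2.70],
}

# A positive slope means the price rose, on average, from week to week.
trends = {market: price_trend(p) for market, p in observations.items()}
rising = [market for market, slope in trends.items() if slope > 0]
```

A positive slope in most markets would support the pricing-power story; a mix of positive and negative slopes would call for a closer look.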
In today’s investment parlance, public web data like this is called alternative data, and most hedge funds are using it to supplement conventional sources such as earnings calls and annual reports. Think aerial imagery of distribution centers, social media listening to gauge investor sentiment, or weather monitoring to predict crop yields.
The global alternative data market is expected to grow at 46.5% annually and be worth $13.91 billion by 2026.
Source: Alternative Data Global Market Report 2022
A Chief Marketing Officer asks:
“How are people selling my product online?”
Jill heads marketing for a European fashion brand where she’s responsible for maintaining an impeccable brand image and customer experience.
This means every photograph has to be on brand. Descriptions must be correct. Promotions have to follow seasonal campaigns. There can be no unauthorized sales or discounts. Shipping times have to be quick.
But, in a world with tens of thousands of online resellers, how do you maintain brand integrity? For Jill’s team, the answer is public web data. They continually search the Internet for their products and feed the results to an algorithm that can notify them of any concerning findings.
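The kind of rule check such an algorithm might run can be sketched as follows. Everything here is hypothetical: the `Listing` fields, the 10% discount limit and the five-day shipping limit are assumptions chosen for illustration, and a real brand-protection system would also cover images and descriptions.

```python
from dataclasses import dataclass

@dataclass
class Listing:
    seller: str          # reseller name as found on the web
    price: float         # advertised price
    shipping_days: int   # advertised delivery time

def find_concerns(listing, list_price, authorized_sellers,
                  max_discount=0.10, max_shipping_days=5):
    """Return the brand rules (assumed for this sketch) that a listing breaks."""
    concerns = []
    if listing.seller not in authorized_sellers:
        concerns.append("unauthorized seller")
    if listing.price < list_price * (1 - max_discount):
        concerns.append("discount beyond allowed limit")
    if listing.shipping_days > max_shipping_days:
        concerns.append("slow shipping")
    return concerns

# Hypothetical usage: one clean listing and one that breaks every rule.
authorized = {"shop_a"}
ok = find_concerns(Listing("shop_a", 95.0, 3), 100.0, authorized)
bad = find_concerns(Listing("shop_x", 70.0, 9), 100.0, authorized)
```

Any listing that returns a non-empty list would be passed on to the team for review.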
Being able to search text, images, audio and video at scale is being used to protect all kinds of brands and their intellectual property. It helps artists spot uses of their work online, and allows social media platforms to block the upload of unlicensed material.
Bright Data scans roughly a petabyte of data every day. That’s 500 billion pages of text, or over 100,000 DVD-quality movies.
Even if you’re searching for a needle in a haystack, we can find it.
A travel CEO asks:
“How can I deliver the best travel deals in the world?”
Bert’s company lives or dies by web traffic. His challenge is to make sure that his website remains the most accurate and complete destination on the web for deals from travel agencies, hotels and airlines.
Although many travel providers make basic information like pricing and schedules available to him through APIs, the data is often fragmented, stale or incomplete.
Bert’s solution? He continuously collects the most current offers across the web. He gets the data in a format that allows his team to algorithmically sort and present it to consumers so they can easily spot the best price, the shortest route, the perfect date, and even free wifi if that’s what matters to them.
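Once offers arrive in a structured format, sorting and filtering them is straightforward. The sketch below assumes a hypothetical `Offer` record with a handful of fields; real travel data would carry many more attributes, and the provider names are invented.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Offer:
    provider: str
    price: float            # total price in one currency
    duration_hours: float   # door-to-door travel time
    free_wifi: bool

def best_offers(offers, key="price", require_wifi=False):
    """Filter offers by the wifi preference, then sort ascending by `key`."""
    candidates = [o for o in offers if o.free_wifi or not require_wifi]
    return sorted(candidates, key=lambda o: getattr(o, key))

# Hypothetical offers collected from three providers.
offers = [
    Offer("air_one", 420.0, 9.5, True),
    Offer("air_two", 380.0, 12.0, False),
    Offer("air_three", 450.0, 7.0, True),
]

cheapest = best_offers(offers)[0]
fastest = best_offers(offers, key="duration_hours")[0]
cheapest_with_wifi = best_offers(offers, require_wifi=True)[0]
```

Sorting by a named field keeps the same function usable for the best price, the shortest route, or any other attribute the consumer cares about.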
Web data is powerful for aggregators like Bert, whether they are gathering credit card offers or listings for the best lawn maintenance companies. It also works for the end suppliers. For example, the hotels on Bert’s site can collect data to understand how their rates stack up against the competition.
Musician-turned-tech investor will.i.am has said, “WWW stands for Wild Wild West.”
Public web data collection brings focus and structure to the chaos of the Internet.
A startup founder asks:
“How can my product stand out from the competition?”
Maia came up with a great product – earphones with a built-in equalizer that lets you personalize the sound based on age-related hearing changes. But without a great business strategy, she knows her product will never get off the ground.
Her first move was to use public data collection tools to discover every competitive product on Amazon and other retail sites. She built a database of prices, features, rankings and everything else that could help her position her product in the sweet spot.
Next was to set up the continuous tracking of competitive data – everything from prices, discounts, inventory levels and shipping times to variables like colors, up-sells and user reviews. On top of that, each variable could be segmented by country and region down to the area code.
All of this gave Maia a serious edge when it came to setting up her marketing funnels, pricing strategies and email promotions. Virtually every ecommerce business can follow this approach to capture and structure web data into real, actionable intelligence.
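The segmentation Maia relied on can be sketched as a simple grouping step. The record layout below (`country`, `region`, `price`) and the sample values are assumptions for illustration; the same pattern would apply to discounts, inventory levels or shipping times.

```python
from collections import defaultdict
from statistics import median

def median_price_by_segment(records):
    """Group competitor prices by (country, region) and take each median."""
    buckets = defaultdict(list)
    for r in records:
        buckets[(r["country"], r["region"])].append(r["price"])
    return {segment: median(prices) for segment, prices in buckets.items()}

# Hypothetical competitor prices collected for similar earphones.
records = [
    {"country": "US", "region": "CA", "price": 49.0},
    {"country": "US", "region": "CA", "price": 59.0},
    {"country": "US", "region": "CA", "price": 55.0},
    {"country": "DE", "region": "BE", "price": 62.0},
    {"country": "DE", "region": "BE", "price": 58.0},
]

medians = median_price_by_segment(records)
```

The median is used here rather than the mean so that a single outlier listing does not distort a segment's typical price.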
Bright Data CEO Or Lenchner says, “The internet is the world’s biggest database, and now you can query it.”
Imagine having a search engine designed to give you actionable business intelligence.
You can think of web data collection as going to the public library and reading every book. Providers like Bright Data have a vested interest in keeping this library open, and take extra precautions to ensure its integrity. We do this through a code of ethics as well as software protocols that prevent any misuse of the technology.
We see the collection and analysis of public web data as a significant positive development for society. It’s a strong competitive advantage for business. It enhances the choices for consumers. And it also helps us grapple with some of the most important environmental and social issues of our time.
For example, on a cool spring morning last year, police kicked in the door of an apartment in south Chicago. Once inside, they arrested two people suspected of human trafficking. They rescued four women, aged 17 to 26, and recovered cash, drugs and guns.
These arrests were part of an operation that combined data collected from online commercial sex advertisements with data science and network analysis to identify potential trafficking activity.
This story comes courtesy of The Bright Initiative – our ongoing commitment to using public web data collection to drive progress in areas such as human rights, regulation, climate change, public health and Internet safety.
Today, the web data adoption curve is rapidly accelerating. Users are asking smart questions and finding valuable answers across every possible domain – from realtors to portfolio managers, manufacturers to musicians, startups to multinationals.
Just imagine if you were the only person on Earth with access to Google. You’d probably be acing every exam, leapfrogging everyone on your career path, and amazing friends with your uncanny knowledge. It would probably be obvious that you were working with a secret advantage.
Now, think of Micah bringing the best investment ideas to the table. Jill building a pristine brand. Bert dominating the travel space. Maia getting traction with her startup. Not to mention acts of justice and heroism all powered by the wonder of web data.
Web data collection tools are like having a superpower. And, at the current pace of change, failing to grasp them now could leave you with that sinking feeling you get when you realize you’re trapped somewhere without wifi.
But there’s no reason for you to get stuck. The tools are laid out before you. All you have to do is work on phrasing the right questions and then…
Just ask.