What are the dangers/risks of collecting data without proxies?

From building dynamic pricing strategies on outdated competitor datasets to making stock portfolio decisions based on old social media posts/sentiment, business scenarios that do not leverage proxies are rife with negative monetary/business outcomes.
Roni Sarfati | Director of business development

In this article, we will discuss some common data collection use cases and how they play out with, and without, proxies:


Common industry datasets for companies involved in both the front end and back end of eCommerce-related industries include:

  • Product pricing among competitors
  • Consumer reviews, both native and across different platforms 
  • Sales volume and Point of Sale (PoS) data

Collecting these data points manually is a slow and tedious process. Site structures change frequently, and datasets change in real time, which creates risks with negative monetary/business outcomes. For example: 

  • Obtaining competitor pricing that is wrong (because it changes faster than you can collect it) puts your dynamic pricing strategy at risk. You stand to lose significant sales volume, not only on a specific sale but in the long term, as consumers won’t return to a digital retailer they perceive as ‘overpriced’. 
  • Reviews help you gain insight into your competitors’ consumer pain points. For example, if their products are hard to assemble, you may offer free assembly with every purchase, thereby boosting sales. But if your competitor has already addressed this issue, and there is now a different way to attract consumer interest (say, overnight shipping), outdated information may keep you from finding this out in time, wasting time and resources on an out-of-date value proposition. 
  • Sales volume helps you understand which products are currently popular, and PoS data sheds light on where/how customers prefer to carry out their purchases. This can be very valuable information. For example, if pink, star-shaped sunglasses are trending and shoppers are paying mostly with PayPal, you can ramp up your order quantities, production levels, and marketing campaigns, and offer special discounts/coupons to those who complete orders using PayPal. 
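In practice, this kind of pricing data is gathered by routing HTTP requests through a proxy so that each fetch can exit from a fresh IP. Below is a minimal sketch using Python’s standard library; the proxy endpoint and the idea of a per-job opener are illustrative assumptions, and a real setup would plug in your proxy provider’s address and credentials.

```python
import urllib.request

# Hypothetical rotating-proxy endpoint -- replace with your provider's.
PROXY_URL = "http://user:pass@proxy.example.com:8000"

def build_opener(proxy_url: str) -> urllib.request.OpenerDirector:
    """Route both HTTP and HTTPS traffic through one proxy endpoint."""
    handler = urllib.request.ProxyHandler({"http": proxy_url,
                                           "https": proxy_url})
    return urllib.request.build_opener(handler)

def fetch_price_page(url: str, proxy_url: str = PROXY_URL) -> str:
    """Fetch a competitor product page through the proxy."""
    with build_opener(proxy_url).open(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

Because the proxy is attached to an opener rather than installed globally, different scraping jobs in the same process can each run through a different proxy pool.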


Common industry data sets for companies involved in the financial industry include:

  • Securities movement 
  • News stories pertaining to a specific stock or industry 
  • Social media sentiment about a stock (e.g. AMC Entertainment Holdings) or a commodity (such as gold or silver)

The risks of collecting this data without proxies include:

  • Movement, or ‘stock volume’ as it is commonly known, is a very important metric for stock traders, fund managers, and especially day traders. It indicates interest in the stock, willingness to buy/sell, and current price stability and mobility. When you trade based on inaccurate stock volumes, your decisions can be skewed either way, leading you to make bad decisions regarding your or your customers’ portfolios. 
  • Securities are very sensitive to news stories. For example, if the FDA approves a drug and that story goes viral, it impacts prices; if a CEO is indicted for fraud and that story comes out, it has real financial consequences. When you trade based on old news, you lose momentum as well as your concrete informational advantage. 
  • Social sentiment can have a major impact on stock movement, as was the case with the Reddit-based WallStreetBets (WSB) group. When the masterminds behind ‘the big short squeeze’ uploaded posts to ‘hold AMC stock’, as it was ‘going to the MOON!’, that meant something in real terms for the stock’s valuation. 
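Collecting fast-moving sentiment data usually means polling sources far more often than a single IP can sustain before being rate-limited, which is where a rotating proxy pool comes in. The sketch below shows the simplest rotation scheme, round-robin over a pool; the proxy addresses are hypothetical placeholders.

```python
from itertools import cycle

# Hypothetical proxy pool -- real addresses come from your proxy provider.
PROXY_POOL = [
    "http://proxy1.example.com:8000",
    "http://proxy2.example.com:8000",
    "http://proxy3.example.com:8000",
]

def proxy_rotator(pool):
    """Round-robin over the pool so successive requests exit from
    different IPs, helping high-frequency polling of sentiment
    sources stay under per-IP rate limits."""
    return cycle(pool)

rotator = proxy_rotator(PROXY_POOL)
first_four = [next(rotator) for _ in range(4)]  # wraps back to proxy1
```

Commercial rotating proxies typically handle this rotation server-side behind a single endpoint, but the round-robin above shows the underlying idea.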


Common industry datasets for companies in manpower/talent-sourcing-related industries include:

  • People data from social media/business networks (including special talents, out-of-the-ordinary work/training experience, language skills, proficiency in specific computer programs)
  • Company data (number of employees, rate of growth, Unique Selling Proposition in their industry, etc.)

The risks of collecting this data without proxies include:

  • People data can be inaccurate for a variety of reasons. For example, the person in question could already be employed, or their skills may have changed in a way that makes them more or less attractive to potential employers.
  • Company data also runs the risk of changing at a fast pace. For example, if the company in question was a tiny startup when you first logged it into your systems and has experienced explosive growth over the past six months, it may now be less attractive to certain potential employees. Some very talented people actually prefer working for smaller companies, where their ability to have a real impact is much greater than at a larger corporation.
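A common way to limit the damage from stale people/company records is to attach a collection timestamp to each record and re-collect anything older than a freshness threshold. The 30-day threshold below is an illustrative assumption, not a recommendation; the right value depends on how fast your sources change.

```python
from datetime import datetime, timedelta

# Illustrative freshness policy: treat records older than 30 days as stale.
MAX_AGE = timedelta(days=30)

def is_stale(collected_at: datetime, now: datetime,
             max_age: timedelta = MAX_AGE) -> bool:
    """True when a record was collected too long ago to be trusted."""
    return now - collected_at > max_age

# A record logged five months ago is due for re-collection;
# one from twelve days ago is still considered fresh.
now = datetime(2024, 6, 1)
stale = is_stale(datetime(2024, 1, 1), now)    # True
fresh = is_stale(datetime(2024, 5, 20), now)   # False
```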

The bottom line

Collecting data manually, without the help of a proxy, is not only slow and tedious. More importantly, it can distort your ability to make smart business decisions based on accurate, real-time data. Using proxies is quicker and more efficient, and will provide you with an accurate, live view of your industry, competitors, and target audiences.

Roni Sarfati | Director of business development

Experienced Senior Business Development Manager with a demonstrated history of working in the online media industry and SaaS. Skilled in Negotiation, Business Planning, Operations Management, Analytical Skills, and Import. Strong sales professional with a BA focused in Political Science from Tel Aviv University.
