Your Data Won’t Serve You For Long If It Was Collected Unethically

This article will give you practical tools and guidelines so your company can build solid data foundations and not a house of cards.

You already understand why it’s important to comply with data privacy regulations – right? What about your company’s data collection standards? Are they up to par?

The following sections will help you take preemptive measures to ensure that your data collection operations are 100% ethical:

Ethically compliant proxy provider checklist

No matter which data collection provider you choose to use, it is important for you to make sure the provider is ethical, obtains its peers legally, and is able to handle data collection at the scale that your business needs.

Here is a 7-part checklist which you should carefully review and use for every proxy/data collection provider you are considering using:

#1: Proxy Sourcing – Ask: ‘How does your company source its Residential and Mobile IPs?’

Legitimate data collection networks and proxy providers will be happy to give you specific names of peer network applications and software.

#2: Peer Consent – Ask: ‘Do you ask for peer consent to use their IP and inform them that their device is being utilized as a peer in a commercial proxy network?’

The only legal and ethical means of obtaining real Residential and Mobile IPs is through informed user consent.

#3: Opt-out – Ask: ‘Are your peers able to easily opt-out of the network at any given time?’

Needless to say, holding peers captive against their will as members of a peer-to-peer data collection network is highly unethical.

#4: GDPR and CCPA compliance – Ask: ‘What personally identifiable information (PII) do you collect from peers? Do you treat PII according to the GDPR guidelines?’

Legal compliance is a basic tenet of ethics in the majority of cases. Ensure that the provider you are considering complies with international data protection laws.

#5: Peer remuneration – Ask: ‘How do you compensate peers for participation in your proxy network?’

Any business that wants to benefit from an individual’s resources should provide each participant with ample compensation. In this case, ensure that users are being fairly compensated for participation in the form of a free membership upgrade, ad-free experience, and the like.

#6: Idle resources – Ask: ‘What are your conditions for when traffic can be routed through a peer’s device? Do you use a peer’s resources only when their device is idle and has sufficient battery power?’

Data collection companies that truly care about the well-being of their network participants will never compromise a peer’s user experience for the sake of routing proxy network traffic.

#7: SDK termination – Ask: ‘When the application that utilizes the vendor’s SDK is uninstalled, is the SDK uninstalled as well?’

A legitimate proxy provider will make sure its SDK is uninstalled once the app that contains it is uninstalled from the peer’s device.

This checklist gives you a simple, easily applicable way to rate how ethical and legally compliant any given data collection or proxy provider is. Only providers that score 7/7 should be used, since failing even one of the seven checks could put your data collection and your business at serious risk.
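For teams that evaluate several providers, the 7/7 rule can be captured in a few lines of code. The following is only a minimal sketch: the item keys, `score_provider`, and `is_acceptable` are hypothetical names chosen for illustration, and the yes/no answers are assumed to come from your own due diligence on each provider.

```python
from typing import List, Mapping, Tuple

# The seven checklist items from this article (key names are illustrative).
CHECKLIST = (
    "proxy_sourcing",
    "peer_consent",
    "opt_out",
    "gdpr_ccpa_compliance",
    "peer_remuneration",
    "idle_resources",
    "sdk_termination",
)


def score_provider(answers: Mapping[str, bool]) -> Tuple[int, List[str]]:
    """Return (score, failed_items); a missing answer counts as a failure."""
    failed = [item for item in CHECKLIST if not answers.get(item, False)]
    return len(CHECKLIST) - len(failed), failed


def is_acceptable(answers: Mapping[str, bool]) -> bool:
    """Apply the all-or-nothing rule: only a 7/7 score passes."""
    score, _ = score_provider(answers)
    return score == len(CHECKLIST)


# Example: a provider that passes everything except SDK termination.
answers = {item: True for item in CHECKLIST}
answers["sdk_termination"] = False
score, failed = score_provider(answers)
print(f"{score}/{len(CHECKLIST)}", failed)  # prints: 6/7 ['sdk_termination']
print(is_acceptable(answers))  # prints: False
```

Treating an unanswered question as a failure is a deliberate choice in this sketch: a provider that will not answer one of the seven questions is scored the same as one that answers it badly.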

Here is a comparison chart that we have made easily downloadable in PDF format for your convenience:

[Image: checklist for verifying that your data collection / proxy provider is ethical and legally compliant]

Image source: Bright Data

Pioneering ethics in the data collection arena

From the get-go, Bright Data put ethical data collection practices front and center in its company DNA. In fact, one of the company’s main mission statements is: creating a fair and ethical means of collecting data that increases competition and benefits both businesses and consumers. Bright Data is an action-oriented company, which is why it has put this mission statement into practice from its first day in business, including:

IP procurement – Is done in a public and consensual manner through peer-to-peer programs such as EarnApp.

The ‘Bright Data Bounty Program’ – Invites the public to spot and alert Bright Data to any perceived security breaches.

Opt-out commitment – Peers can opt out at any time, and Bright Data commits to completely removing both the app and the SDK from partner devices. Read more about ‘What makes our SDK ethical?’

Third-party audits – Bright Data continuously works with leading independent firms to ensure its networks are up to regulation, security, and legal standards.

Quality over quantity – Bright Data carefully screens its SDK partners, who serve as the source for the majority of its peer base, to ensure that only the highest quality peers take part in the Bright Data network. This in turn serves as a guarantee of network users’ security.

Data for all – From automated data unlocking technology to data collection automation, Bright Data is actively working to make data a commodity that is accessible to everyone (not just large corporations).

Ethical onboarding – Bright Data works hard to ensure not only that its peer network is ethical and compliant but also that all current and future network users utilize the network only for legal and ethical use cases. This entails a rigorous Know Your Customer (KYC) process, ongoing usage log checks, and a dedicated compliance officer and team.

Zero IP reselling – Bright Data has a zero IP reselling policy. Period. This is extremely important, as IP reselling is a widespread, unethical practice in the proxy network industry. When an IP is resold, it can create an absurd scenario in which a company pays two separate providers to use the same IP while getting subpar results.

Beyond ethics, cutting-edge data collection technology

A one-on-one conversation with Bright Data’s CTO, Ron Kol, reveals how the company’s technological advances are leading the business community into the future of data collection.

Q: What is the most important factor in a proxy network from a technology standpoint? Is IP pool size still as important as it used to be or is the industry moving in a different direction?

A: Pool size is still important, but it’s no longer enough to guarantee success. Websites have begun to look beyond the network level (IP) into the protocol level (SSL, HTTP) and the browser level (user fingerprints), and it’s becoming harder to gain access even to public information. Success rates will drop dramatically against sites that have started to deploy blocks on these levels. Higher-level services, like Bright Data’s Web Unlocker and Web Scraper IDE tools, are required to get good success rates.

Q: What would you say is the biggest challenge of data collection technology right now, and looking 10 years into the future?

A: The biggest challenge right now is how to commoditize public data collection: how to build large-scale, dependable public data collection platforms of high-quality, validated data and make them accessible to all, not just engineers.

Looking into the future, the technical challenge will be to create an indexed, aggregated database of public information, where you could look up any public dataset from any website as easily as you run a search on Google. That means mapping the public internet into business-usable datasets and making it transparent and accessible to all.

Summing up

However you choose to source your data, it is important to ensure that it is ethically sound and legally compliant. Doing your own independent due diligence is just as important on the customer end as it is on the data collection and proxy network end. Only when you do your homework and ensure that the data engine powering your business is running on oil and not water can you be certain that business performance will be optimal over the long haul. Gaining access to next-gen data collection technology won’t hurt either.