How Is Online Data Collection Making The Internet A Safer Place For All?

A conversation with Roman Hussy, founder of a non-profit project fighting cyber attacks

Cybersecurity experts predict that this year there will be a cyber attack every 11 seconds, with financial damages from cybercrime expected to reach $6 trillion. That frequency is nearly double the 2019 rate (one attack every 19 seconds) and almost quadruple the 2016 rate (one every 40 seconds). One only needs to look at the recent cyber attack on North America’s biggest petrol pipeline to understand the sheer damage these nefarious players can wreak.

Cyber attacks have accelerated during COVID-19: the US FBI reported a 300% increase in cyber crimes in 2020 alone. To keep up with an ever-evolving ecosystem and a rapidly changing new normal, both the public and private sectors must prioritize building and maintaining cyber-resilient infrastructures. As Peter Firstbrook, VP Analyst at Gartner, puts it: “As the new normal takes shape, all organizations will need an always-connected defensive posture and clarity on what business risks remote users elevate to remain secure.”

The cyber ecosystem is continuously changing due to the perpetual ‘push and pull’ between cybercriminals and security experts, and cyber crimes are becoming more sophisticated than ever. But where are the bad actors focusing their efforts in 2021?

Image: simple blue graphic of a hooded hacker on a laptop surrounded by virtual cyber threats. Source: Bright Data

The 2021 cyber trends

  • Ransomware attacks: attacks in which malware encrypts files and makes systems inoperable, with attackers essentially holding company data hostage. Fueled by pandemic turmoil, ransomware attacks have grown by 62% since 2019. Given how profitable these attacks are, the numbers show no sign of slowing down.
  • DDoS attacks: attacks in which cybercriminals overload servers with high volumes of traffic, eventually bringing down a company’s website. With a range of new tools that amplify these kinds of attacks (there are currently 12.5 million available online), DDoS attacks are certainly intensifying.
  • Remote and cloud attacks: 2020 forced the world to work remotely, and many companies had to adjust quickly and move to the cloud. Because home networks are less secure than corporate networks, bad actors have found new opportunities to exploit employee networks and wreak havoc on business systems. With many people continuing to work from home in 2021, this trend is unlikely to fade.
  • Fileless attacks: attacks that typically begin with a link that redirects to a malicious website. Unlike traditional malware, fileless attacks do not require any malicious files to be written to a victim’s system, so they are very difficult to detect. Fileless malware attacks increased by 900% in 2020 and are still proving to be a very popular weapon in 2021.

With the rise of highly sophisticated threat tactics, implementing advanced, multi-layered security measures is simply a must. But how else can organizations and the general public fight these bad actors? And what does online data have to do with it?

We sat down with Roman Hussy to learn more about his non-profit project and how he is using online data to fight malware and botnets.

Can you tell us about the project?

It is a non-profit project that fights malware and botnets. One of the platforms it operates is URLhaus, where vetted, trusted security researchers can exchange information on sites that are being used for malware distribution. So far, the project has identified and taken down over 1 million sites that bad actors were using to spread malware. IT security researchers, vendors, and law enforcement agencies rely on the project’s data, which aims to make the internet a safer place for all.

Why do you need to use Bright Data’s platform, as part of The Bright Initiative?

The problem I regularly face is that some bad actors try to block the automated requests made by URLhaus. The platform regularly checks whether a site is still malicious by connecting to the remote site and checking its content. Some bad actors are aware of this process and attempt to block those requests.

This is why I use Bright Data’s services, thanks to The Bright Initiative. By using the data collection platform, I can track these bad actors’ sites and provide this valuable data for free to the community so they can protect themselves from threats originating from these bad websites.

As part of my work, I need to verify in an automated way whether a site poses a threat. As you know, bad actors are getting more and more sophisticated. They have not only figured out how to identify URLhaus’s automated requests sent to these sites but have also started blocking those requests. As a result, these malicious websites did not appear on our list of potential security threats. They passed as if they were legitimate sites when, in fact, they were clearly not.

To fight these sites, I started using Bright Data’s infrastructure to truly verify whether a site poses a threat. By using Bright Data’s services, I can overcome the bad actors’ methods and identify and differentiate the bad sites from the good ones. This obviously also helps the community – by letting them know about current cyber threats.
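To make the idea concrete, here is a minimal Python sketch of checking a suspicious URL from more than one network vantage point. It is an illustration only and does not reflect how URLhaus or Bright Data are actually implemented: the status labels, the function names, and the notion of pluggable "fetchers" (for example, one direct connection and one routed through a proxy) are all assumptions made for this example. A real checker would also inspect the returned payload rather than rely on the HTTP status alone.

```python
from typing import Callable, Iterable, Tuple
import urllib.error
import urllib.request

# Hypothetical status labels for this sketch; not URLhaus's real schema.
ONLINE, OFFLINE, BLOCKED = "online", "offline", "possibly_blocked"

# A fetcher takes a URL and returns (HTTP status code, first bytes of the body).
Fetcher = Callable[[str], Tuple[int, bytes]]


def http_fetch(url: str, timeout: float = 10.0) -> Tuple[int, bytes]:
    """Fetch a URL directly and return (status, first 4 KiB of the body)."""
    request = urllib.request.Request(
        url, headers={"User-Agent": "example-checker/0.1"}  # assumed value
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.read(4096)
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 403 or 404.
        return err.code, b""


def check_site(url: str, fetch: Fetcher = http_fetch) -> str:
    """Classify a single check made through one fetcher."""
    try:
        status, body = fetch(url)
    except OSError:
        # Covers connection failures, DNS errors, and timeouts.
        return OFFLINE
    if status in (403, 429):
        # The site answers but refuses this client: it may be blocking us.
        return BLOCKED
    if status == 200 and body:
        return ONLINE
    return OFFLINE


def check_from_vantage_points(url: str, fetchers: Iterable[Fetcher]) -> str:
    """Treat the site as online if any vantage point can see its content."""
    results = [check_site(url, fetch) for fetch in fetchers]
    if ONLINE in results:
        return ONLINE
    if BLOCKED in results:
        return BLOCKED
    return OFFLINE
```

The point of the second function is the one made in the interview: a site that blocks one checker can look harmless or dead, so a single negative result is not trusted on its own.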

How is it done?

Once I identify a website that is causing harm, and once I can verify it by using Bright Data, I immediately publish the information on the project website where security researchers, security solutions vendors, or law enforcement teams can use this information and take action. This includes legal action as well as using the information to protect their own networks and users from these proven threats.

This data is available for free for everyone to use. In fact, it protects millions of users. Anyone can access this data and download it to protect themselves. I know that these datasets are used very broadly by open-source tools, for example, and also by DNS service and software providers like Cloudflare or Quad9. Using these datasets in such a way clearly protects millions of people from cybersecurity threats.
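As a rough illustration of how a downloaded list of malicious URLs could be used for protection, the Python sketch below turns such a list into a set of hostnames and checks lookups against it. The feed format is an assumption for this example (plain text, one URL per line, with comment lines starting with "#"), and the function names are hypothetical; it is not a description of how Cloudflare, Quad9, or any specific tool consumes the data.

```python
from typing import Set
from urllib.parse import urlsplit


def load_blocked_hosts(feed_text: str) -> Set[str]:
    """Extract the hostnames from a plain-text list of malicious URLs."""
    hosts: Set[str] = set()
    for line in feed_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        try:
            host = urlsplit(line).hostname  # lowercased by urlsplit
        except ValueError:
            continue  # skip malformed entries
        if host:
            hosts.add(host)
    return hosts


def is_blocked(hostname: str, blocked_hosts: Set[str]) -> bool:
    """Return True if the hostname appears on the blocklist."""
    normalized = hostname.strip().lower().rstrip(".")
    return normalized in blocked_hosts
```

A resolver or proxy built along these lines would call `is_blocked` before answering a query or forwarding a request, and refuse when it returns True. Blocking a whole hostname because one URL on it is malicious is a design choice that can overblock shared hosting, so a real deployment would need to weigh that trade-off.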

There’s no denying that cyber attacks are happening at unprecedented rates. We are privileged to partner with projects such as this one, as part of The Bright Initiative, to address this pressing issue and fight cybercrime.