The world is documented. But not for everyone.
To put it mildly, we live in an extraordinary time. The internet is the largest dataset ever created. Every day, billions of data points are generated, published, and made publicly available. Prices, policies, trends, statements, patterns of behaviour, and early warning signs of harm are in constant flux.
This information exists. It’s out there. And yet access to it, meaning the ability to collect it at scale, structure it, and actually use it in the real world, remains profoundly unequal.
For large corporations with engineering teams and infrastructure budgets, public web data is a competitive edge they’ve been using for years. They track markets, monitor competitors, and feed AI models with the web’s signal in near real-time.
For everyone else? A small non-profit trying to monitor online harm. A university research team studying inequality. A public health body watching for early signs of an outbreak. A journalist investigating corporate misconduct. These organisations are often trying to do the work that matters most, and doing it with a fraction of the resources.
That’s not just an inconvenience. It’s a structural problem.
And when data access is unequal, outcomes are unequal too. At the Bright Initiative we are working to change that.
Information asymmetry has always shaped power. This is just the latest version.
This isn’t a new dynamic. Throughout history, those who could access, interpret, and act on information faster than others held the advantage. What’s changed is the scale of the gap. The fact that so much of the information that could rebalance things is technically public should be the game changer.
Think about what it means to hold power accountable in the digital age. To track whether a platform is actually removing harmful content. To document patterns of environmental damage from satellite and sensor data. To understand how policy changes ripple through communities. And in an age of mis- and disinformation, to see which narratives are spreading, where, and how fast.
None of these require secret data. The information is public. The problem is access to the infrastructure needed to use it responsibly. Right now, unfortunately, that infrastructure is not evenly distributed.
Democratization isn’t just a tech trend. It’s a justice issue.
When I talk about democratizing public web data, I mean something fundamental: ensuring that the capacity to see what’s happening in the world isn’t reserved solely for those with the deepest pockets.
Because here’s what I’ve observed, across hundreds of partnerships with nonprofits, academics, and public institutions: the organisations doing the most important work (protecting communities, holding power to account, advancing knowledge that benefits everyone) are consistently the ones with the least data infrastructure.
They’re not lacking in intelligence, commitment, or mission. They’re lacking in access. And access, it turns out, changes everything. When a small organisation can finally collect and analyse public information at scale, something real and tangible shifts.
The quality of their research improves. Their advocacy becomes evidence-based. Their ability to respond, to document harm, to spot patterns, and to act before things get worse grows dramatically.
Accountability doesn’t happen in a vacuum
Real accountability requires the ability to check. To compare what’s said against what’s done. To see patterns that individual data points can’t reveal. And that requires the kind of systematic, scaled access to public information that, right now, only a few actors in the world can reliably achieve.
This is why I believe democratizing access to public web data isn’t just a nice thing to do. It’s foundational to how accountability works in the digital age. You cannot hold power accountable with one hand tied behind your back.
We’re at an inflection point. The choices made now will matter.
The good news is that things are shifting. The conversation around responsible data access has matured enormously. There’s growing recognition, among technologists, policymakers, and civil society alike, that the question isn’t whether public web data is a powerful force. It clearly is. The question is: who gets to use it, how, and for what?
That conversation needs more voices in it. Not just the big platforms debating their own interests. Not just regulators trying to catch up with technology. But researchers, advocates, journalists, educators. The people actually doing the work on the ground, who understand better than anyone what it would mean to have the same access to public information that well-resourced actors currently take for granted.
I’ve seen what happens when that access is extended. I’ve watched small teams answer questions that seemed unanswerable. I’ve seen researchers spot patterns that changed the conversation. I’ve seen organisations go from working on instinct to working on evidence.
That’s not magic. It’s what access to public information actually looks like in practice.
What I believe
I believe the internet, the public part of it, should be a resource for everyone, not a competitive advantage for the few.
I believe that the organisations working on the hardest problems in the world deserve the same ability to understand that world as those generating profit from it.
And I believe that if we get this right, if we build toward a future where access to public web data is genuinely democratized and comes with the right ethical frameworks and responsible practices, we create the conditions for something remarkable.
A world where it’s harder to hide harm. Where patterns of injustice are harder to deny. Where the signals that were always there finally reach the people positioned to act on them.
We’re not there yet. But the direction is right. And the urgency is real.