Today, we’re going to learn how to use proxies with HTTPX. A proxy sits between your scraper and the site you’re trying to scrape: your scraper sends its request to the proxy server, and the proxy fetches the target site and returns it to your scraper.
How To Use Unauthenticated Proxies
In summary, all of our requests go to a proxy_url. Below is an example using an unauthenticated proxy, meaning we don’t use a username or password. This example was inspired by the HTTPX documentation.
import httpx
proxy_url = "http://localhost:8030"
with httpx.Client(proxy=proxy_url) as client:
    ip_info = client.get("https://geo.brdtest.com/mygeo.json")
    print(ip_info.text)
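HTTPX can also route traffic through a different proxy per URL scheme by mounting a transport for each one. Below is a minimal sketch, assuming a recent HTTPX version and reusing the same local proxy for both schemes (in practice, you could swap in separate endpoints):

import httpx

proxy_url = "http://localhost:8030"

# Mount a proxied transport per URL scheme. Each entry could point
# at a different proxy server if needed.
mounts = {
    "http://": httpx.HTTPTransport(proxy=proxy_url),
    "https://": httpx.HTTPTransport(proxy=proxy_url),
}

with httpx.Client(mounts=mounts) as client:
    ip_info = client.get("https://geo.brdtest.com/mygeo.json")
    print(ip_info.text)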
How To Use Authenticated Proxies
When a proxy requires a username and password, it’s called an “authenticated” proxy. These credentials are used to authenticate your account and give you a connection to the proxy.
With authentication, our proxy_url looks like this: http://<username>:<password>@<proxy_url>:<port_number>. In the example below, we use both our zone and username to create the user portion of the authentication string.
We’re using datacenter proxies for our base connection.
import httpx
username = "your-username"
zone = "your-zone-name"
password = "your-password"
proxy_url = f"http://brd-customer-{username}-zone-{zone}:{password}@brd.superproxy.io:33335"
ip_info = httpx.get("https://geo.brdtest.com/mygeo.json", proxy=proxy_url)
print(ip_info.text)
The code above is pretty simple. It’s the basis for any sort of proxy connection you want to set up.
- First, we create our config variables: username, zone, and password.
- We use those to create our proxy_url: f"http://brd-customer-{username}-zone-{zone}:{password}@brd.superproxy.io:33335".
- We make a request to the API to get general information about our proxy connection.
Your response should look similar to this.
{"country":"US","asn":{"asnum":20473,"org_name":"AS-VULTR"},"geo":{"city":"","region":"","region_name":"","postal_code":"","latitude":37.751,"longitude":-97.822,"tz":"America/Chicago"}}
How To Use Rotating Proxies
When we use rotating proxies, we create a list of proxies and choose from them randomly. In the code below, we create a list of countries. When we make a request, we use random.choice() to pick a random country from the list. Our proxy_url gets formatted to fit the country.
The example below creates a small list of rotating proxies.
import httpx
import random

countries = ["us", "gb", "au", "ca"]
username = "your-username"
proxy_url = "brd.superproxy.io:33335"
datacenter_zone = "your-zone"
datacenter_pass = "your-password"

# make one request per entry in the list, each through a randomly chosen country
for _ in countries:
    print("----------connection info-------------")
    datacenter_proxy = f"http://brd-customer-{username}-zone-{datacenter_zone}-country-{random.choice(countries)}:{datacenter_pass}@{proxy_url}"
    ip_info = httpx.get("https://geo.brdtest.com/mygeo.json", proxy=datacenter_proxy)
    print(ip_info.text)
This example really isn’t all that different from our first one. Here are the key differences.
- We create an array of countries: ["us", "gb", "au", "ca"].
- Instead of making a single request, we make multiple ones. Each time we create a new request, we use random.choice(countries) to pick a random country for our proxy_url.
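The same idea works with full proxy URLs. If you have several distinct endpoints instead of one endpoint with multiple countries, you can rotate over those directly. A minimal sketch with hypothetical placeholder URLs:

import httpx
import random

# hypothetical endpoints -- replace with your real proxy URLs
proxies = [
    "http://user1:pass1@proxy-one.example.com:8080",
    "http://user2:pass2@proxy-two.example.com:8080",
    "http://user3:pass3@proxy-three.example.com:8080",
]

for _ in range(3):
    # pick a different proxy at random for each request
    random_proxy = random.choice(proxies)
    ip_info = httpx.get("https://geo.brdtest.com/mygeo.json", proxy=random_proxy)
    print(ip_info.text)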
How To Create a Fallback Proxy Connection
In the examples above, we’ve used only datacenter and free proxies. Free proxies aren’t very reliable, and datacenter proxies tend to get blocked by more difficult sites.
In this example, we create a function called safe_get(). When we call this function, it first tries to get the url using a datacenter connection. If this fails, it falls back to our residential connection.
import asyncio

import httpx
from bs4 import BeautifulSoup

country = "us"
username = "your-username"
proxy_url = "brd.superproxy.io:33335"

datacenter_zone = "datacenter_proxy1"
datacenter_pass = "datacenter-password"
residential_zone = "residential_proxy1"
residential_pass = "residential-password"

cert_path = "/home/path/to/brightdata_proxy_ca/New SSL certifcate - MUST BE USED WITH PORT 33335/BrightData SSL certificate (port 33335).crt"

datacenter_proxy = f"http://brd-customer-{username}-zone-{datacenter_zone}-country-{country}:{datacenter_pass}@{proxy_url}"
residential_proxy = f"http://brd-customer-{username}-zone-{residential_zone}-country-{country}:{residential_pass}@{proxy_url}"

async def safe_get(url: str):
    # first attempt: the cheaper datacenter connection
    async with httpx.AsyncClient(proxy=datacenter_proxy) as client:
        print("trying with datacenter")
        response = await client.get(url)
        if response.status_code == 200:
            # a 200 can still be a CAPTCHA page, so inspect the HTML too
            soup = BeautifulSoup(response.text, "html.parser")
            if not soup.select_one("form[action='/errors/validateCaptcha']"):
                print("response successful")
                return response
    print("response failed")
    # fallback: residential connection, which requires the SSL certificate
    async with httpx.AsyncClient(proxy=residential_proxy, verify=cert_path) as client:
        print("trying with residential")
        response = await client.get(url)
        print("response successful")
        return response

async def main():
    url = "https://www.amazon.com"
    response = await safe_get(url)
    with open("out.html", "w") as file:
        file.write(response.text)

asyncio.run(main())
This example is a bit more complicated than the other ones we’ve dealt with in this article.
- We now have two sets of config variables, one for our datacenter connection and one for our residential connection.
- This time, we use an AsyncClient() session to introduce some of the more advanced functionality of HTTPX.
- First, we attempt to make our request with the datacenter_proxy.
- If we fail to get a proper response, we retry the request using our residential_proxy. Also note the verify flag in the code: when using our residential proxies, you need to download and use our SSL certificate.
- Once we’ve got a solid response, we write the page to an HTML file. We can open this page up in our browser and see what the proxy actually accessed and sent back to us.
If you try the code above, your output and resulting HTML file should look a lot like this.
trying with datacenter
response failed
trying with residential
response successful
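If you want extra resilience on top of the fallback, you can wrap safe_get() in a simple retry loop. Below is a minimal sketch, assuming the safe_get() defined above; the max_retries parameter and backoff values are illustrative:

import asyncio
import httpx

async def get_with_retries(url: str, max_retries: int = 3):
    # hypothetical wrapper around the safe_get() defined above
    for attempt in range(1, max_retries + 1):
        try:
            return await safe_get(url)
        except httpx.HTTPError as exc:
            # catches network-level failures: timeouts, connect errors, etc.
            print(f"attempt {attempt} failed: {exc}")
            await asyncio.sleep(2 ** attempt)  # simple exponential backoff
    raise RuntimeError(f"all {max_retries} attempts failed for {url}")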
How Bright Data Products Help
As you’ve probably noticed throughout this article, our datacenter proxies are very affordable and our residential proxies provide an excellent fallback for when datacenter proxies don’t work. We also provide various other tools to assist with your data collection needs.
- Web Unlocker: Get past even the most difficult anti-bots. Web Unlocker automatically recognizes and solves any CAPTCHAs on the page. Once it’s through the anti-bots, it sends you back the web page.
- Scraping Browser: This product has even more features. Scraping Browser actually allows you to control a remote browser with proxy integration and an automated CAPTCHA solver.
- Web Scraper APIs: With these APIs, we do the scraping for you. All you need to do is call the API and parse the JSON data you receive in the response.
- Datasets: Explore our dataset marketplace to find hundreds of pre-collected datasets, or request/build a custom one. You can choose a refresh rate and filter only the data points you need.
Conclusion
When you combine HTTPX with our proxies, you get a private, efficient, and reliable way to scrape the web. If you want to rotate proxies, it’s as simple as using Python’s built-in random library. With a combination of datacenter and residential proxies, you can build a redundant connection that gets past most blocking systems.
As you learned, Bright Data offers the full package for your web scraping projects. Start your free trial with Bright Data’s proxies today!