What is a Python Proxy Server

Python Proxy Server allows routing of HTTP/S requests through a vast network of IPs via Python code. It supports features like IP rotation, session persistence, and geolocation targeting.

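In this tutorial, you will learn:

  • What a Python proxy server is
  • How to implement an HTTP proxy server in Python
  • The pros and cons of using a custom Python proxy server

Let’s dive in!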

What Is a Python Proxy Server?

A Python proxy server is a Python application that acts as an intermediary between clients and the Internet. It intercepts requests from clients, forwards them to the target servers, and sends the response back to the client. By doing so, it masks the client’s identity to the destination servers. 

Read our article to dig into what a proxy server is and how it works.

Python’s socket programming capabilities make it easy to implement a basic proxy server, allowing users to inspect, modify, or redirect network traffic. Proxy servers are great for caching, improving performance, and enhancing security when it comes to web scraping.

How to Implement an HTTP Proxy Server in Python

Follow the steps below and learn how to build a Python proxy server script.

Step 1: Initialize Your Python Project

Before getting started, make sure to have Python 3+ installed on your machine. Otherwise, download the installer, execute it, and follow the installation wizard.

Next, use the commands below to create a python-http-proxy-server folder and initialize a Python project with a virtual environment inside it: 

mkdir python-http-proxy-server

cd python-http-proxy-server

python -m venv env

Open the python-http-proxy-server folder in your Python IDE and create an empty proxy_server.py file.

Great! You have everything you need to build an HTTP proxy server in Python.

Step 2: Initialize an Incoming Socket

First, you need to create a socket server for accepting incoming requests. If you are not familiar with that concept, a socket is a low-level programming abstraction that allows for bidirectional data flow between a client and a server. In the context of a proxy server, a server socket is used to listen for incoming connections from clients.

Use the following lines to create a socket-based server in Python:

port = 8888
# create a TCP socket for the proxy server
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# bind the proxy server to a specific address and port
server.bind(('127.0.0.1', port))
# start listening, queuing up to 10 pending connections
server.listen(10)

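This initializes an incoming socket server and binds it to the 127.0.0.1:8888 local address. Then, the listen() method puts the socket in listening mode so that it can accept connections. Its argument is the backlog, which is the number of pending connections the system queues before refusing new ones.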

Note: Feel free to change the number of the port the web proxy should listen to. You can also modify the script to read that information from the command line for maximum flexibility. 
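As a minimal sketch of the command-line idea mentioned in the note, the snippet below reads the listening address with argparse from the Python Standard Library. The parse_proxy_args() helper and the --host and --port flag names are assumptions made for this example, not part of the original script:

```python
import argparse

def parse_proxy_args(argv=None):
    """Parse the address the proxy should listen on from the command line."""
    parser = argparse.ArgumentParser(description="Minimal HTTP proxy server")
    # --host and --port are hypothetical flag names chosen for this sketch
    parser.add_argument("--host", default="127.0.0.1", help="address to bind to")
    parser.add_argument("--port", type=int, default=8888, help="port to listen on")
    return parser.parse_args(argv)

# an explicit argument list is passed here so the example does not depend on sys.argv
args = parse_proxy_args(["--port", "9000"])
print(f"Would bind to {args.host}:{args.port}")
```

In the real script, you would call parse_proxy_args() with no arguments and pass args.host and args.port to server.bind().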

socket comes from the Python Standard Library, so add the following import at the top of your script:

import socket

To monitor that the Python proxy server has started as required, log this message:

print(f"Proxy server listening on port {port}...")

Step 3: Accept Client Requests

When a client connects to the proxy server, the server needs to create a new socket to handle communication with that specific client. This is how you can do it in Python:

# listen for incoming requests
while True:
    client_socket, addr = server.accept()
    print(f"Accepted connection from {addr[0]}:{addr[1]}")
    # create a thread to handle the client request
    client_handler = threading.Thread(target=handle_client_request, args=(client_socket,))
    client_handler.start()

To handle multiple client requests simultaneously, you should use multithreading as above. Do not forget to import threading from the Python Standard Library:

import threading

As you can see, the proxy server handles incoming requests through the custom handle_client_request() function. See how it is defined in the next steps.
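To see the thread-per-connection pattern in isolation, here is a small self-contained sketch. It is not the proxy itself: the handle_client() placeholder simply echoes the received bytes in uppercase, the server binds to a port chosen by the operating system, and it accepts only two connections so that the demo terminates. All of these are assumptions made for illustration:

```python
import socket
import threading

def handle_client(client_socket):
    # placeholder handler for this sketch: echo back what the client sent, uppercased
    with client_socket:
        data = client_socket.recv(1024)
        client_socket.sendall(data.upper())

def serve(server, max_connections):
    # accept a fixed number of connections so the demo terminates
    for _ in range(max_connections):
        client_socket, _addr = server.accept()
        # each connection gets its own thread, as in the proxy loop
        threading.Thread(target=handle_client, args=(client_socket,), daemon=True).start()

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
server.listen(10)
port = server.getsockname()[1]
acceptor = threading.Thread(target=serve, args=(server, 2), daemon=True)
acceptor.start()

# connect twice as a client and collect the replies
replies = []
for message in (b"first", b"second"):
    with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
        client.sendall(message)
        replies.append(client.recv(1024))

acceptor.join(timeout=5)
server.close()
print(replies)
```

Because each connection runs in its own thread, a slow client does not prevent the accept loop from serving the next one.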

Step 4: Process the Incoming Requests

Once the client socket has been created, you need to use it to:

  1. Read the data from the incoming requests.
  2. Extract the target server’s host and port from that data.
  3. Use it to forward the client request to the destination server.
  4. Get the response and forward it to the original client.

In this section, let’s focus on the first two steps. Define the handle_client_request() function and use it to read the data from the incoming request:

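def handle_client_request(client_socket):
    print("Received request:\n")
    # read the data sent by the client in the request
    request = b''
    client_socket.setblocking(False)
    while True:
        try:
            # receive a chunk of data from the client
            data = client_socket.recv(1024)
        except BlockingIOError:
            # no more data available to read right now
            break
        if not data:
            # the client closed the connection
            break
        request = request + data
        # log the received chunk
        print(f"{data.decode('utf-8', errors='replace')}")
    # switch back to blocking mode for the rest of the exchange
    client_socket.setblocking(True)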

setblocking(False) sets the client socket to non-blocking mode. Then, recv() reads the incoming data, which gets appended to request in byte format. Since you do not know the size of the incoming request data, you have to read it one chunk at a time. In this case, a chunk size of 1024 bytes has been specified. In non-blocking mode, if recv() does not find any data, it raises a BlockingIOError exception. Thus, the except clause marks the end of the read operation. Keep in mind that this simple approach assumes the whole request has already arrived when the handler starts reading.
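If you want to avoid relying on the request being fully available right away, one possible alternative is to read with a timeout until the blank line that ends the HTTP headers shows up. The sketch below is a variation on the approach above, not the tutorial's method. The read_http_request_head() name, the 5-second timeout, and the 64 KB limit are assumptions, and request bodies (for example, in POST requests) are not handled:

```python
import socket

def read_http_request_head(sock, timeout=5.0, max_bytes=65536):
    """Read from sock until the end of the HTTP headers (a blank line) is seen."""
    sock.settimeout(timeout)  # block, but give up after `timeout` seconds
    request = b""
    while b"\r\n\r\n" not in request and len(request) < max_bytes:
        try:
            data = sock.recv(1024)
        except socket.timeout:
            break  # the peer went quiet: return what has been read so far
        if not data:
            break  # the peer closed the connection
        request += data
    return request

# demo: a connected socket pair stands in for the client and the proxy
client_side, proxy_side = socket.socketpair()
client_side.sendall(b"GET http://example.com/ HTTP/1.1\r\n")
client_side.sendall(b"Host: example.com\r\n\r\n")
head = read_http_request_head(proxy_side)
client_side.close()
proxy_side.close()
print(head.decode("utf-8"))
```

The loop stops as soon as the header terminator is found, so it works even when the request arrives in several pieces.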

Note the logged messages to keep track of what the Python proxy server is doing.

After retrieving the incoming request, you need to extract the destination server’s host and port from it:

host, port = extract_host_port_from_request(request)

In particular, this is what the extract_host_port_from_request() function looks like:

def extract_host_port_from_request(request):
    # get the value after the "Host:" string
    host_string_start = request.find(b'Host: ') + len(b'Host: ')
    host_string_end = request.find(b'\r\n', host_string_start)
    host_string = request[host_string_start:host_string_end].decode('utf-8')
    # ignore anything after a "/" in the host string
    webserver_pos = host_string.find("/")
    if webserver_pos == -1:
        webserver_pos = len(host_string)
    # look for a specific port
    port_pos = host_string.find(":")
    if port_pos == -1 or webserver_pos < port_pos:
        # no port specified: use the default HTTP port
        port = 80
        host = host_string[:webserver_pos]
    else:
        # extract the specific port from the host string
        port = int(host_string[port_pos + 1:webserver_pos])
        host = host_string[:port_pos]
    return host, port

To better understand what it does, consider the example below. This is what an incoming request usually looks like once decoded:

GET http://example.com/your-page HTTP/1.1
Host: example.com
User-Agent: curl/8.4.0
Accept: */*
Proxy-Connection: Keep-Alive

extract_host_port_from_request() extracts the web server’s host and port from the “Host:” field. In this case, host is example.com and port is 80 (as a specific port has not been specified). 
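For comparison, here is a compact variant of the same idea that scans the header lines one by one. It is only a sketch: the parse_host_header() name is hypothetical, and it assumes a well-formed request. Unlike the function above, it matches the header name case-insensitively and raises an error when no Host header is present. IPv6 literals are not given any special treatment:

```python
def parse_host_header(request, default_port=80):
    """Return (host, port) from the Host header of a raw HTTP request."""
    head = request.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    # skip the request line, then scan the header lines
    for line in head.split("\r\n")[1:]:
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "host":
            value = value.strip()
            host, port_sep, port = value.rpartition(":")
            if port_sep and port.isdigit():
                # an explicit port follows the last colon
                return host, int(port)
            # no explicit port: fall back to the default
            return value, default_port
    raise ValueError("request has no Host header")

example = (
    b"GET http://example.com/your-page HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"Accept: */*\r\n\r\n"
)
print(parse_host_header(example))
print(parse_host_header(b"GET / HTTP/1.1\r\nhost: localhost:8080\r\n\r\n"))
```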

Step 5: Forward the Client Request and Handle the Response

Given the target host and port, you now have to forward the client request to the destination server. In handle_client_request(), create a new socket and use it to send the original request to the desired destination:

# create a socket to connect to the original destination server
destination_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# connect to the destination server
destination_socket.connect((host, port))
# send the original request
destination_socket.sendall(request)

Then, get ready to receive the server response and propagate it to the original client:

# read the data received from the server
# one chunk at a time and send it to the client
print("Received response:\n")
while True:
    # receive a chunk of data from the destination server
    data = destination_socket.recv(1024)
    # log the received chunk
    print(f"{data.decode('utf-8', errors='replace')}")
    if len(data) > 0:
        # send back to the client
        client_socket.sendall(data)
    else:
        # no more data to receive
        break

Again, you need to work one chunk at a time as you do not know the size of the response. When recv() returns empty data, the destination server has closed the connection, so there is nothing left to receive and you can terminate the operation.
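The chunk-by-chunk copy can be isolated in a small helper and tried out without any network access. In the sketch below, the relay() function name is an assumption, and two local socket pairs stand in for the server-to-proxy and proxy-to-client connections:

```python
import socket

def relay(source, destination, chunk_size=1024):
    """Copy bytes from source to destination until source reaches end of stream."""
    total = 0
    while True:
        data = source.recv(chunk_size)
        if not data:
            break  # empty bytes: the sender closed its side
        destination.sendall(data)
        total += len(data)
    return total

# two socket pairs stand in for the server->proxy and proxy->client links
server_side, proxy_in = socket.socketpair()
proxy_out, client_side = socket.socketpair()

body = b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok"
server_side.sendall(body)
server_side.close()  # closing signals the end of the stream to the relay

copied = relay(proxy_in, proxy_out)
proxy_in.close()
proxy_out.close()

# read everything the "client" received
received = b""
while True:
    chunk = client_side.recv(1024)
    if not chunk:
        break
    received += chunk
client_side.close()
print(f"Relayed {copied} bytes")
```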

Do not forget to close the two sockets you defined in the function:

# close the sockets

destination_socket.close()

client_socket.close()
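If an exception is raised before the close() calls are reached, the sockets stay open. One way to guard against that, shown here as a sketch rather than as part of the tutorial's script, is to use the socket as a context manager. The forward_request() helper and the throwaway local server standing in for the destination are assumptions made for the demo:

```python
import socket
import threading

def forward_request(request, host, port, timeout=10.0):
    """Send request to (host, port) and return the full response bytes."""
    chunks = []
    # the with block closes the socket even if sendall() or recv() raises
    with socket.create_connection((host, port), timeout=timeout) as destination_socket:
        destination_socket.sendall(request)
        while True:
            data = destination_socket.recv(1024)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)

def fake_destination(listener):
    # throwaway local server: read the request, send a fixed response, then close
    conn, _ = listener.accept()
    with conn:
        conn.recv(1024)
        conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok")

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen(1)
threading.Thread(target=fake_destination, args=(listener,), daemon=True).start()

response = forward_request(
    b"GET / HTTP/1.1\r\nHost: localhost\r\n\r\n",
    "127.0.0.1",
    listener.getsockname()[1],
)
listener.close()
print(response.decode("utf-8"))
```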

Awesome! You just created an HTTP proxy server in Python. Time to see the entire code, launch it, and verify that it works as expected!

Step 6: Put It All Together

This is the final code of your Python proxy server script:

import socket
import threading

def handle_client_request(client_socket):
    print("Received request:\n")
    # read the data sent by the client in the request
    request = b''
    client_socket.setblocking(False)
    while True:
        try:
            # receive a chunk of data from the client
            data = client_socket.recv(1024)
        except BlockingIOError:
            # no more data available to read right now
            break
        if not data:
            # the client closed the connection
            break
        request = request + data
        # log the received chunk
        print(f"{data.decode('utf-8', errors='replace')}")
    # switch back to blocking mode for the rest of the exchange
    client_socket.setblocking(True)
    # extract the webserver's host and port from the request
    host, port = extract_host_port_from_request(request)
    # create a socket to connect to the original destination server
    destination_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # connect to the destination server
    destination_socket.connect((host, port))
    # send the original request
    destination_socket.sendall(request)
    # read the data received from the server
    # one chunk at a time and send it to the client
    print("Received response:\n")
    while True:
        # receive a chunk of data from the destination server
        data = destination_socket.recv(1024)
        # log the received chunk
        print(f"{data.decode('utf-8', errors='replace')}")
        if len(data) > 0:
            # send back to the client
            client_socket.sendall(data)
        else:
            # no more data to receive
            break
    # close the sockets
    destination_socket.close()
    client_socket.close()

def extract_host_port_from_request(request):
    # get the value after the "Host:" string
    host_string_start = request.find(b'Host: ') + len(b'Host: ')
    host_string_end = request.find(b'\r\n', host_string_start)
    host_string = request[host_string_start:host_string_end].decode('utf-8')
    # ignore anything after a "/" in the host string
    webserver_pos = host_string.find("/")
    if webserver_pos == -1:
        webserver_pos = len(host_string)
    # look for a specific port
    port_pos = host_string.find(":")
    if port_pos == -1 or webserver_pos < port_pos:
        # no port specified: use the default HTTP port
        port = 80
        host = host_string[:webserver_pos]
    else:
        # extract the specific port from the host string
        port = int(host_string[port_pos + 1:webserver_pos])
        host = host_string[:port_pos]
    return host, port

def start_proxy_server():
    port = 8888
    # create a TCP socket for the proxy server
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # bind the proxy server to a specific address and port
    server.bind(('127.0.0.1', port))
    # start listening, queuing up to 10 pending connections
    server.listen(10)
    print(f"Proxy server listening on port {port}...")
    # listen for incoming requests
    while True:
        client_socket, addr = server.accept()
        print(f"Accepted connection from {addr[0]}:{addr[1]}")
        # create a thread to handle the client request
        client_handler = threading.Thread(target=handle_client_request, args=(client_socket,))
        client_handler.start()

if __name__ == "__main__":
    start_proxy_server()

Launch it with this command:

python proxy_server.py

You should see the following message in the terminal:

Proxy server listening on port 8888...

To make sure that the server works, execute a proxy request with cURL. Read our guide to learn more on how to use cURL with a proxy.

Open a new terminal and run:

curl --proxy "http://127.0.0.1:8888" "http://httpbin.org/ip"

This makes a GET request to the http://httpbin.org/ip destination through the http://127.0.0.1:8888 proxy server.

You should get something like:

{
  "origin": "45.12.80.183"
}

That is the IP of the proxy server. Why? Because the /ip endpoint of the HTTPBin project returns the IP the request comes from. If you are running the server locally, “origin” will correspond to your IP. 
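The same check can also be scripted in Python instead of cURL. The sketch below uses urllib from the Python Standard Library. The build_proxy_opener() and fetch_via_proxy() helper names are assumptions, and the default proxy address assumes the proxy server is running locally on port 8888:

```python
import urllib.request

def build_proxy_opener(proxy_url="http://127.0.0.1:8888"):
    """Return an opener that routes plain HTTP requests through proxy_url."""
    proxy_handler = urllib.request.ProxyHandler({"http": proxy_url})
    return urllib.request.build_opener(proxy_handler)

def fetch_via_proxy(url, proxy_url="http://127.0.0.1:8888", timeout=10):
    """Fetch url through the proxy and return the decoded response body."""
    opener = build_proxy_opener(proxy_url)
    with opener.open(url, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With the proxy server running, calling fetch_via_proxy("http://httpbin.org/ip") should return the same kind of JSON body as the cURL command.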

Note: The Python proxy server built here works only with HTTP destinations. Extending it to handle HTTPS connections is quite tricky.
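To give an idea of where HTTPS support would start: HTTPS clients ask a proxy to open a tunnel with the CONNECT method, sending a request line such as "CONNECT example.com:443 HTTP/1.1". The proxy is then expected to connect to that target, confirm with a 200 response, and blindly relay bytes in both directions. The sketch below covers only the first step, recognizing the request and extracting the target. The parse_connect_target() name is hypothetical, and the tunneling logic itself is not shown:

```python
def parse_connect_target(request):
    """Return (host, port) if request is a CONNECT request, else None."""
    request_line = request.split(b"\r\n", 1)[0].decode("iso-8859-1")
    parts = request_line.split(" ")
    # a CONNECT request line has the form: CONNECT host:port HTTP/1.1
    if len(parts) != 3 or parts[0].upper() != "CONNECT":
        return None
    host, sep, port = parts[1].rpartition(":")
    if not sep or not port.isdigit():
        return None
    return host, int(port)

print(parse_connect_target(b"CONNECT example.com:443 HTTP/1.1\r\nHost: example.com:443\r\n\r\n"))
print(parse_connect_target(b"GET http://example.com/ HTTP/1.1\r\nHost: example.com\r\n\r\n"))
```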

Now, explore the log written by your proxy server Python application. It should contain:

Received request:

GET http://httpbin.org/ip HTTP/1.1
Host: httpbin.org
User-Agent: curl/8.4.0
Accept: */*
Proxy-Connection: Keep-Alive

Received response:

HTTP/1.1 200 OK
Date: Thu, 14 Dec 2023 14:02:08 GMT
Content-Type: application/json
Content-Length: 31
Connection: keep-alive
Server: gunicorn/19.9.0
Access-Control-Allow-Origin: *
Access-Control-Allow-Credentials: true

{
  "origin": "45.12.80.183"
}

This tells you that the proxy server received the request in the format specified by the HTTP protocol. Then, it forwarded it to the destination server, logged the response data, and sent the response back to the client. Why are we sure of that? Because the IPs in “origin” are the same!

Congrats! You just learned how to build an HTTP proxy server in Python!

Pros and Cons of Using a Custom Python Proxy Server

Now that you know how to implement a proxy server in Python, you are ready to see the benefits and limitations of this approach.

Pros:

  • Total control: With a custom Python script like this, you have total control over what your proxy server does. No shady activity or data leakage there!
  • Customization: The proxy server can be extended to include useful features such as logging and caching of requests to improve performance.
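As an illustration of the caching idea, here is a minimal in-memory cache with a time-to-live. It is only a sketch: the ResponseCache class, its method names, and the choice of the request line as the cache key are all assumptions. It also ignores HTTP cache headers, and a multithreaded proxy would likely need a lock around it:

```python
import time

class ResponseCache:
    """In-memory cache mapping a request key to raw response bytes."""

    def __init__(self, ttl_seconds=60.0, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # injectable so expiry can be checked without sleeping
        self._entries = {}

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, response = entry
        if self.clock() - stored_at > self.ttl_seconds:
            del self._entries[key]  # the entry is stale: drop it
            return None
        return response

    def put(self, key, response):
        self._entries[key] = (self.clock(), response)

cache = ResponseCache(ttl_seconds=30)
key = "GET http://httpbin.org/ip"
cache.put(key, b"HTTP/1.1 200 OK\r\n\r\n")
hit = cache.get(key)
miss = cache.get("GET http://httpbin.org/headers")
print(hit is not None, miss is None)
```

In the proxy, you could look up the cache before connecting to the destination server and store the response after relaying it.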

Cons:

  • Infrastructure costs: Setting up a proxy server architecture is not easy and costs a lot of money in terms of hardware or VPS services.
  • Hard to maintain: You are responsible for maintaining the architecture of the proxy, especially its scalability and availability. This is a task that only experienced system administrators can tackle.
  • Unreliable: The main issue with this solution is that the exit IP of the proxy server never changes. As a result, anti-bot technologies will be able to block the IP and prevent the server from accessing the desired resources. In other words, the proxy will eventually stop working.

These limitations and drawbacks make a custom Python proxy server a poor fit for production scenarios. The solution? A reliable proxy provider like Bright Data! Create an account, verify your identity, get a free proxy, and use it in your favorite programming language. For example, integrate a proxy into your Python script with requests.

Our huge proxy network includes millions of fast, reliable, secure proxy servers all over the world. Find out why we are the best proxy server provider.

Conclusion

In this guide, you learned what a proxy server is and how it works in Python. In detail, you learned how to build one from scratch using sockets. You have now become a master of proxies in Python. The main issue with this approach is that the static exit IP of your proxy server will eventually get you blocked. Avoid that with Bright Data’s rotating proxies!

Bright Data controls the best proxy servers in the world, serving Fortune 500 companies and more than 20,000 customers. Its offer includes a wide range of proxy types.

That reliable, fast, and global proxy network is also the basis of a number of web scraping services to effortlessly retrieve data from any site.
