In this guide, you will learn:
- What cURL Impersonate is
- The reasons behind the project and how it works
- How to use it via the command line
- How to use it in Python
- Advanced techniques and aspects
Let’s dive in!
What Is cURL Impersonate?
cURL Impersonate is a special build of cURL designed to mimic the behavior of major browsers (namely Chrome, Edge, Safari, and Firefox). In detail, this tool performs TLS and HTTP handshakes that closely resemble those of real browsers.
The HTTP client can be used either through the curl-impersonate command-line tool, similar to regular curl, or as a library in Python.
These are the browsers that can be impersonated:
Browser | Simulated OS | Wrapper Script |
Chrome 99 | Windows 10 | curl_chrome99 |
Chrome 100 | Windows 10 | curl_chrome100 |
Chrome 101 | Windows 10 | curl_chrome101 |
Chrome 104 | Windows 10 | curl_chrome104 |
Chrome 107 | Windows 10 | curl_chrome107 |
Chrome 110 | Windows 10 | curl_chrome110 |
Chrome 116 | Windows 10 | curl_chrome116 |
Chrome 99 | Android 12 | curl_chrome99_android |
Edge 99 | Windows 10 | curl_edge99 |
Edge 101 | Windows 10 | curl_edge101 |
Firefox 91 ESR | Windows 10 | curl_ff91esr |
Firefox 95 | Windows 10 | curl_ff95 |
Firefox 98 | Windows 10 | curl_ff98 |
Firefox 100 | Windows 10 | curl_ff100 |
Firefox 102 | Windows 10 | curl_ff102 |
Firefox 109 | Windows 10 | curl_ff109 |
Firefox 117 | Windows 10 | curl_ff117 |
Safari 15.3 | macOS Big Sur | curl_safari15_3 |
Safari 15.5 | macOS Monterey | curl_safari15_5 |
Each supported browser has a dedicated wrapper script that configures curl-impersonate with the appropriate headers, flags, and settings to simulate that browser.
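If you select wrappers from a script, the names in the table above can be kept in a small lookup. The following is a minimal sketch: the grouping keys (such as "chrome_android") and the latest_wrapper helper are illustrative names, not part of curl-impersonate itself.

```python
# Wrapper script names copied from the table above, grouped by browser
# family and ordered from oldest to newest. The dictionary keys are
# illustrative labels chosen for this sketch.
WRAPPERS = {
    "chrome": [
        "curl_chrome99", "curl_chrome100", "curl_chrome101", "curl_chrome104",
        "curl_chrome107", "curl_chrome110", "curl_chrome116",
    ],
    "chrome_android": ["curl_chrome99_android"],
    "edge": ["curl_edge99", "curl_edge101"],
    "firefox": [
        "curl_ff91esr", "curl_ff95", "curl_ff98", "curl_ff100",
        "curl_ff102", "curl_ff109", "curl_ff117",
    ],
    "safari": ["curl_safari15_3", "curl_safari15_5"],
}


def latest_wrapper(browser):
    """Return the newest wrapper script listed for a browser family."""
    try:
        return WRAPPERS[browser.lower()][-1]
    except KeyError:
        raise ValueError(f"no wrapper listed for {browser!r}") from None


print(latest_wrapper("chrome"))
print(latest_wrapper("Firefox"))
```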
How curl-impersonate Works
When you send a request to a website over HTTPS, a process called the TLS handshake occurs. During this handshake, details about the HTTP client are shared with the web server, creating a unique TLS fingerprint.
HTTP clients have capabilities and configurations that differ from those of a standard browser. This discrepancy results in a TLS fingerprint that can easily reveal the use of HTTP clients. As a result, anti-bot measures used by the target site can detect your requests as automated and potentially block them.
cURL Impersonate addresses this issue by modifying the standard curl tool to make its TLS fingerprint match that of real browsers. Here is how it achieves the goal:
- TLS library modification: For the Chrome version of curl-impersonate, curl is compiled with BoringSSL, Google’s TLS library. For the Firefox version, curl is compiled with NSS, the TLS library used by Firefox.
- Configuration adjustments: It modifies how cURL configures various TLS extensions and SSL options to mimic the settings of real browsers. It also adds support for new TLS extensions that are commonly used by browsers.
- HTTP/2 handshake customization: It changes the settings cURL uses for HTTP/2 connections to align with those of real browsers.
- Non-default flags: It runs with specific non-default flags, such as --ciphers, --curves, and some -H headers, which further helps in mimicking browser behavior.
Thus, curl-impersonate makes curl requests appear, from a network perspective, as if they were made by a real browser. This is useful for bypassing many bot detection mechanisms!
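To make the idea of a TLS fingerprint more concrete, here is a rough sketch of JA3, one widely used way to summarize a ClientHello. JA3 joins five fields (TLS version, cipher suites, extensions, elliptic curves, and point formats) into a string and takes its MD5 hash. The ClientHello values below are made up for illustration and are not captured from any real browser or curl build.

```python
import hashlib


def ja3_hash(tls_version, ciphers, extensions, curves, point_formats):
    """Build a JA3 string from ClientHello fields and return its MD5 hex digest."""
    fields = [
        str(tls_version),
        "-".join(map(str, ciphers)),
        "-".join(map(str, extensions)),
        "-".join(map(str, curves)),
        "-".join(map(str, point_formats)),
    ]
    return hashlib.md5(",".join(fields).encode("ascii")).hexdigest()


# Hypothetical values: two clients offering different ciphers and extensions.
browser_like = ja3_hash(771, [4865, 4866, 4867, 49195], [0, 23, 65281, 10, 11], [29, 23, 24], [0])
client_like = ja3_hash(771, [4866, 4867, 4865, 49196], [0, 11, 10, 13], [29, 23, 24], [0])

print(browser_like)
print(client_like)
print("same fingerprint:", browser_like == client_like)
```

Because the hash covers the order and content of every list, even a small difference in the offered ciphers or extensions yields a different fingerprint, which is why matching a browser requires changes at the TLS library level.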
curl-impersonate: Command Line Tutorial
Follow the steps below to learn how to use cURL Impersonate from the command line.
Note: For completeness, multiple installation methods will be displayed. However, you need to choose only one. The recommended method is using Docker.
Installation From Pre-Compiled Binaries
You can download pre-compiled binaries for Linux and macOS from the GitHub releases page of the project. These binaries contain a statically compiled curl-impersonate. Before using them, ensure you have the following installed:
- NSS (Network Security Services): A set of libraries designed to support cross-platform development of security-enabled client and server applications. NSS is used in Mozilla products like Firefox and Thunderbird for handling the TLS protocol.
- CA certificates: A collection of digital certificates that authenticate the identity of servers and clients during secure communications. They ensure that your connection to a server is trustworthy by verifying that the server’s certificate has been signed by a recognized CA (Certificate Authority).
To meet the prerequisites, on Ubuntu, run:
sudo apt install libnss3 nss-plugin-pem ca-certificates
On Red Hat, Fedora, or CentOS, execute:
yum install nss nss-pem ca-certificates
On Arch Linux, launch:
pacman -S nss ca-certificates
On macOS, fire this command:
brew install nss ca-certificates
Also, ensure you have zlib installed on your system, as the pre-compiled binary packages are gzipped.
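If you want a quick sanity check before running the binaries, the sketch below asks Python's ctypes.util.find_library whether the NSS and zlib shared libraries can be located. It is only a rough check under the assumption that the libraries are named nss3 and z on your platform: it can miss libraries installed in non-standard locations, and it does not verify CA certificates.

```python
import ctypes.util


def find_prerequisites():
    """Map each prerequisite library to what the loader resolves it to, or None."""
    return {
        "nss3": ctypes.util.find_library("nss3"),
        "zlib": ctypes.util.find_library("z"),
    }


for name, found in find_prerequisites().items():
    print(f"{name}: {found or 'not found'}")
```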
Installation Through Docker
Docker images—based on Alpine Linux and Debian—with curl-impersonate compiled and ready to use are available on Docker Hub. These images include the binary and all necessary wrapper scripts.
The Chrome images (*-chrome) can impersonate Chrome, Edge, and Safari, while the Firefox images (*-ff) can impersonate Firefox.
To download the Docker image you prefer, use one of the commands below.
For Chrome version on Alpine Linux:
docker pull lwthiker/curl-impersonate:0.5-chrome
For Firefox version on Alpine Linux:
docker pull lwthiker/curl-impersonate:0.5-ff
For Chrome version on Debian:
docker pull lwthiker/curl-impersonate:0.5-chrome-slim-buster
For Firefox version on Debian:
docker pull lwthiker/curl-impersonate:0.5-ff-slim-buster
Once downloaded, as you are about to see, you can execute curl-impersonate using a docker run command.
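The four image names above follow one pattern: a version, a browser flavor (chrome or ff), and an optional -slim-buster suffix for Debian. The helper below is a small sketch of that pattern. The function name image_tag and its parameters are illustrative, and the default version simply mirrors the 0.5 tag used in the commands above.

```python
def image_tag(browser, debian=False, version="0.5"):
    """Return the Docker Hub image reference for a browser family and base OS."""
    flavors = {"chrome": "chrome", "firefox": "ff"}
    try:
        flavor = flavors[browser.lower()]
    except KeyError:
        raise ValueError(f"unsupported browser family: {browser!r}") from None
    # Alpine images have no suffix; Debian images end in -slim-buster.
    suffix = "-slim-buster" if debian else ""
    return f"lwthiker/curl-impersonate:{version}-{flavor}{suffix}"


print(image_tag("chrome"))
print(image_tag("firefox", debian=True))
```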
Installation From Distro Packages
On Arch Linux, curl-impersonate is available through the AUR package curl-impersonate-bin.
On macOS, you can install the unofficial Homebrew package for the Chrome version with the following commands:
brew tap shakacode/brew
brew install curl-impersonate
Basic Usage
Regardless of the installation method, you can now execute a curl-impersonate command using this syntax:
curl-impersonate-wrapper [options] [target-url]
Or, equivalently, on Docker, run something like:
docker run --rm lwthiker/curl-impersonate:[curl-impersonate-version] curl-impersonate-wrapper [options] [target-url]
Where:
- curl-impersonate-wrapper is the cURL Impersonate wrapper you want to use (e.g., curl_chrome116, curl_edge101, curl_ff117, or curl_safari15_5).
- options are the optional flags that will be passed on to cURL.
- target-url is the URL of the web page to make an HTTP request to.
Be cautious when specifying custom options, as some flags alter cURL’s TLS signature, potentially making it detectable. To learn more, check out our introduction to cURL.
Note that the wrappers automatically set a default collection of HTTP headers. To customize these headers, modify the wrapper scripts to suit your needs.
Now, let’s use curl-impersonate to make a request to the Wikipedia homepage using a Chrome wrapper:
curl_chrome110 https://www.wikipedia.org
Or, if you are a Docker user:
docker run --rm lwthiker/curl-impersonate:0.5-chrome curl_chrome110 https://www.wikipedia.org
The result will be:
<html lang="en" class="no-js">
<head>
<meta charset="utf-8">
<title>Wikipedia</title>
<meta name="description" content="Wikipedia is a free online encyclopedia, created and edited by volunteers around the world and hosted by the Wikimedia Foundation.">
<!-- omitted for brevity... -->
Wonderful! The server returned the HTML of the desired page as if you were accessing it via a browser.
You can now use cURL Impersonate for web scraping just as you would use cURL for web scraping.
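If you prefer to drive the command-line tool from a script, the two invocation styles shown above (direct wrapper and Docker) can be assembled programmatically. The following is a minimal sketch, where build_command is a hypothetical helper and not something shipped with curl-impersonate.

```python
import shlex


def build_command(wrapper, url, options=(), docker_image=None):
    """Return the argv list for a curl-impersonate call, optionally via Docker."""
    argv = [wrapper, *options, url]
    if docker_image:
        argv = ["docker", "run", "--rm", docker_image, *argv]
    return argv


local = build_command("curl_chrome110", "https://www.wikipedia.org")
docker = build_command(
    "curl_chrome110",
    "https://www.wikipedia.org",
    docker_image="lwthiker/curl-impersonate:0.5-chrome",
)

print(shlex.join(local))
print(shlex.join(docker))

# To execute one of them and capture the HTML (requires the tool or Docker):
#   import subprocess
#   html = subprocess.run(local, capture_output=True, text=True, check=True).stdout
```

Building the command as a list rather than a single string avoids shell quoting issues when a URL or option contains special characters.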
curl-impersonate: Python Tutorial
Command line usage is great for testing, but web scraping processes typically rely on custom scripts written in languages like Python. Discover the best programming languages for web scraping!
Fortunately, you can use cURL Impersonate in Python thanks to curl-cffi. This is a Python binding for curl-impersonate via cffi. In particular, curl-cffi can impersonate browsers’ TLS/JA3 and HTTP/2 fingerprints to connect to web pages without getting blocked.
See how to use it in the step-by-step section below!
Prerequisites
Before getting started, make sure you have:
- Python 3.8+ installed on your machine
- A Python project with a virtual environment set up
Optionally, a Python IDE like Visual Studio Code with the Python extension is recommended.
Installation
Install curl_cffi via pip as follows:
pip install curl_cffi
Usage
curl_cffi provides both a low-level curl API and a high-level requests-like API. Find out more in the official documentation.
Typically, you want to use the requests-like API. To do this, import requests from curl_cffi:
from curl_cffi import requests
You can now use the Chrome version of cURL Impersonate in Python to connect to a web page with:
response = requests.get("https://www.wikipedia.org", impersonate="chrome")
Print the response HTML with:
print(response.text)
Put it all together, and you will get:
from curl_cffi import requests
# make a GET request to the target page with
# the Chrome version of curl-impersonate
response = requests.get("https://www.wikipedia.org", impersonate="chrome")
# print the server response
print(response.text)
Run the above Python script, and it will print:
<html lang="en" class="no-js">
<head>
<meta charset="utf-8">
<title>Wikipedia</title>
<meta name="description" content="Wikipedia is a free online encyclopedia, created and edited by volunteers around the world and hosted by the Wikimedia Foundation.">
<!-- omitted for brevity... -->
Great! You are now ready to perform web scraping in Python, just as you would with Requests and Beautiful Soup. For more guidance, follow our guide on web scraping with Python.
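In a real scraper you will usually want a timeout and a status check around the request. Below is a minimal sketch of such a wrapper. The fetch_html name and its get parameter are illustrative: get exists only so the helper can be exercised with a stand-in function, and by default it falls back to requests.get from curl_cffi.

```python
def fetch_html(url, browser="chrome", timeout=30, get=None):
    """Fetch a page with a browser-like fingerprint and return its HTML."""
    if get is None:
        # Imported lazily so the helper can be tested with a stand-in `get`.
        from curl_cffi import requests
        get = requests.get
    response = get(url, impersonate=browser, timeout=timeout)
    # Raise an exception on 4xx/5xx responses instead of returning an error page.
    response.raise_for_status()
    return response.text
```

With curl_cffi installed, calling fetch_html("https://www.wikipedia.org") should return the same HTML as the script above, while a blocked or failed request surfaces as an exception rather than as unexpected page content.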
cURL Impersonate Advanced Usage
Time to explore some advanced usages and techniques!
Proxy Integration
Simulating browser fingerprints may not be enough. Anti-bot solutions might still block you, especially if you make too many automated requests in a short amount of time. This is where proxies come in!
By routing your request through a proxy server, you can get a fresh IP address and protect your identity.
Suppose the URL to your proxy server is:
http://84.18.12.16:8888
cURL Impersonate supports proxy integration via the command line using the -x flag:
curl_chrome110 -x http://84.18.12.16:8888 https://httpbin.org/ip
For more details, read how to set a proxy in cURL.
In Python, you can set up a proxy similarly to how you would with requests:
from curl_cffi import requests
proxies = {"http": "http://84.18.12.16:8888", "https": "http://84.18.12.16:8888"}
response = requests.get("https://httpbin.org/ip", impersonate="chrome", proxies=proxies)
For additional information, see how to integrate a proxy with Python requests.
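If you have more than one proxy available, a simple way to spread requests across them is to cycle through a pool. The sketch below is only an illustration: apart from the example proxy used above, the addresses are placeholders from the documentation-only 192.0.2.0/24 range, and as_proxies and next_proxies are hypothetical helper names.

```python
from itertools import cycle


def as_proxies(proxy_url):
    """Build the requests-style proxies mapping for a single proxy URL."""
    return {"http": proxy_url, "https": proxy_url}


# Placeholder pool: replace these with the proxy servers you actually control.
PROXY_POOL = [
    "http://84.18.12.16:8888",
    "http://192.0.2.10:8888",
    "http://192.0.2.11:8888",
]
_rotation = cycle(PROXY_POOL)


def next_proxies():
    """Return the proxies mapping for the next proxy in the pool."""
    return as_proxies(next(_rotation))


print(next_proxies())

# Each request can then pick the next proxy, for example:
#   requests.get("https://httpbin.org/ip", impersonate="chrome", proxies=next_proxies())
```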
Libcurl Integration
libcurl-impersonate is a compiled version of libcurl that includes the same cURL Impersonate features. It also offers an extended API for adjusting TLS details and header configurations.
libcurl-impersonate can be installed using the pre-compiled package. Its goal is to facilitate the integration of cURL Impersonate into libraries in various programming languages, such as the curl-cffi Python package.
Conclusion
In this article, you learned what cURL Impersonate is, how it works, and how to use it both via CLI and in Python. You now understand that it is a tool for making HTTP requests while simulating the TLS fingerprint of real-world browsers.
The problem is that advanced anti-bot solutions like Cloudflare may still detect your requests as coming from a bot. The solution? Bright Data’s Scraper API—a next-generation, all-in-one, comprehensive scraping solution.
Scraper API provides everything you need to perform automated web requests using cURL or any other HTTP client. This full-featured solution handles browser fingerprinting, CAPTCHA solving, and IP rotation for you to bypass any anti-bot technology. Making automated HTTP requests has never been easier!
Register now for a free trial of Bright Data’s web scraping infrastructure or talk to one of our data experts about our scraping solutions.