How to Use Undetected ChromeDriver for Web Scraping

Discover how Undetected ChromeDriver helps bypass anti-bot systems for web scraping, plus step-by-step guidance, advanced methods, and key limitations.

In this guide, you will learn:

  • What Undetected ChromeDriver is and how it can be useful
  • How it minimizes bot detection
  • How to use it with Python for web scraping
  • Advanced usage and methods
  • Its key limitations and drawbacks
  • Similar technologies

Let’s dive in!

What Is Undetected ChromeDriver?

Undetected ChromeDriver is a Python library that provides an optimized version of Selenium’s ChromeDriver, patched to limit detection by anti-bot services such as:

  • Imperva
  • DataDome
  • Distil Networks

It can also help bypass certain Cloudflare protections, although that can be more challenging. For more details, follow our guide on how to bypass Cloudflare.

If you have ever used browser automation tools like Selenium, you know they let you control browsers programmatically. To make that possible, they configure browsers differently from regular user setups.

Anti-bot systems look for those differences, or “leaks,” to identify automated browser bots. Undetected ChromeDriver patches Chrome drivers to minimize these telltale signs, reducing bot detection. This makes it ideal for web scraping sites protected by anti-scraping measures!
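As a quick illustration of such a leak, vanilla Selenium exposes the navigator.webdriver flag, which anti-bot scripts commonly inspect. Here is a minimal sketch of that check (the exact value returned may vary by Chrome version):

from selenium import webdriver

driver = webdriver.Chrome()
driver.get("https://example.com")

# Vanilla Selenium typically reports true here, flagging the browser as automated;
# patched drivers aim to make this check come back empty instead
print(driver.execute_script("return navigator.webdriver"))

driver.quit()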

How It Works

Undetected ChromeDriver reduces detection from Cloudflare, Imperva, DataDome, and similar solutions by employing the following techniques:

  • Renaming Selenium variables to mimic those used by real browsers
  • Using legitimate, real-world User-Agent strings to avoid detection
  • Allowing the user to simulate natural human interaction
  • Managing cookies and sessions properly while navigating websites
  • Enabling the use of proxies to bypass IP blocking and prevent rate limiting

These methods help the browser controlled by the library bypass various anti-scraping defenses effectively.
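Several of these techniques map onto standard Chrome options that you can also set yourself. Here is a minimal sketch (the User-Agent string and proxy URL below are illustrative placeholders to adapt to your setup):

import undetected_chromedriver as uc

options = uc.ChromeOptions()
# Present a realistic, up-to-date User-Agent (placeholder string for illustration)
options.add_argument("--user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36")
# Route traffic through a proxy to avoid IP blocks and rate limiting
options.add_argument("--proxy-server=<YOUR_PROXY_URL>")

driver = uc.Chrome(options=options)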

Using Undetected ChromeDriver for Web Scraping: Step-By-Step Guide

Most sites use advanced anti-bot measures to block automated scripts from accessing their pages. Those mechanisms effectively stop web scraping bots as well.

For example, assume you want to scrape the title and description from the following GoDaddy product page:

The GoDaddy target page

With plain Selenium in Python, your scraping script will look something like this:

# pip install selenium

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

# configure a Chrome instance to start in headless mode
options = Options()
options.add_argument("--headless")

# create a Chrome web driver instance
driver = webdriver.Chrome(service=Service(), options=options)

# connect to the target page
driver.get("https://www.godaddy.com/hosting/wordpress-hosting")

# scraping logic...

# close the browser
driver.quit()

If you are unfamiliar with this logic, take a look at our guide on Selenium web scraping.

When you run the script, it will fail because of this error page:

An "Access Denied" page from GoDaddy

In other words, the Selenium script was blocked by an anti-bot solution (Akamai, in this example).

So, how do you get around this? The answer is Undetected ChromeDriver!

Follow the steps below to learn how to use the undetected_chromedriver Python library for web scraping.

Step #1: Prerequisites and Project Setup

Undetected ChromeDriver has the following prerequisites:

  • Latest version of Chrome
  • Python 3.6+: If Python 3.6 or later is not installed on your machine, download it from the official site and follow the installation instructions.

Note: The library automatically downloads and patches the driver binary for you, so there is no need to manually download ChromeDriver.

Now, use the following command to create a directory for your project:

mkdir undetected-chromedriver-scraper

The undetected-chromedriver-scraper directory will serve as the project folder for your Python scraper.

Navigate into it and initialize a virtual environment:

cd undetected-chromedriver-scraper
python -m venv env

Open the project folder in your preferred Python IDE. Visual Studio Code with the Python extension or PyCharm Community Edition are both great choices.

Next, create a scraper.py file inside the project folder, following the structure shown below:

scraper.py in the project folder

Currently, scraper.py is an empty Python script. Shortly, you will be adding the scraping logic to it.

In your IDE’s terminal, activate the virtual environment. On Linux or macOS, use:

source ./env/bin/activate

On Windows, run instead:

env\Scripts\activate

Amazing! You now have a Python environment ready for web scraping via browser automation.

Step #2: Install Undetected ChromeDriver

In an activated virtual environment, install Undetected ChromeDriver via the undetected_chromedriver pip package:

pip install undetected_chromedriver 

Behind the scenes, this library will automatically install Selenium, as it is one of its dependencies. Thus, you do not need to install Selenium separately. This also means you will have access to all selenium imports by default.
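For example, after installing only undetected_chromedriver, Selenium imports like these work out of the box, with no extra installation step:

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC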

Step #3: Initial Setup

Import undetected_chromedriver:

import undetected_chromedriver as uc

You can then initialize a Chrome WebDriver with:

driver = uc.Chrome()

Just like Selenium, this will open a browser window that you can control using the Selenium API. That means the driver object exposes all standard Selenium methods, along with some additional features we will explore later.

The key difference is that this version of the Chrome driver is patched to help bypass certain anti-bot solutions.

To close the driver, simply call the quit() method:

driver.quit() 

Here is what a basic Undetected ChromeDriver setup looks like:

import undetected_chromedriver as uc

# Initialize a Chrome instance
driver = uc.Chrome()

# Scraping logic...

# Close the browser and release its resources
driver.quit()

Fantastic! You are now ready to perform web scraping directly in the browser.

Step #4: Use It for Web Scraping

Warning: This section follows the same steps as a standard Selenium setup. If you are already familiar with Selenium web scraping, feel free to skip ahead to the next section with the final code.

First, use the get() method to navigate the browser to your target page:

driver.get("https://www.godaddy.com/hosting/wordpress-hosting")

Next, visit the page in incognito mode in your browser and inspect the element you want to scrape:

The DevTools inspection of the HTML elements to scrape data with

Assume you want to extract the product title, tagline, and description.

You can scrape all of these with the following code:

headline_element = driver.find_element(By.CSS_SELECTOR, "[data-cy=\"headline\"]")

title_element = headline_element.find_element(By.CSS_SELECTOR, "h1")
title = title_element.text

tagline_element = headline_element.find_element(By.CSS_SELECTOR, "h2")
tagline = tagline_element.text

description_element = headline_element.find_element(By.CSS_SELECTOR, "[data-cy=\"description\"]")
description = description_element.text

To make the above code work, you need to import By from Selenium:

from selenium.webdriver.common.by import By

Now, store the scraped data in a Python dictionary:

product = {
  "title": title,
  "tagline": tagline,
  "description": description
}

Finally, export the data to a JSON file:

with open("product.json", "w") as json_file:
  json.dump(product, json_file, indent=4)

Do not forget to import json from the Python standard library:

import json

And there you have it! You just implemented basic Undetected ChromeDriver web scraping logic.

Step #5: Put It All Together

This is the final scraping script:

import undetected_chromedriver as uc
from selenium.webdriver.common.by import By
import json

# Create a Chrome web driver instance
driver = uc.Chrome()

# Connect to the target page
driver.get("https://www.godaddy.com/hosting/wordpress-hosting")

# Scraping logic
headline_element = driver.find_element(By.CSS_SELECTOR, "[data-cy=\"headline\"]")

title_element = headline_element.find_element(By.CSS_SELECTOR, "h1")
title = title_element.text

tagline_element = headline_element.find_element(By.CSS_SELECTOR, "h2")
tagline = tagline_element.text

description_element = headline_element.find_element(By.CSS_SELECTOR, "[data-cy=\"description\"]")
description = description_element.text

# Populate a dictionary with the scraped data
product = {
  "title": title,
  "tagline": tagline,
  "description": description
}

# Export the scraped data to JSON
with open("product.json", "w") as json_file:
  json.dump(product, json_file, indent=4)

# Close the browser and release its resources
driver.quit() 

Execute it with:

python3 scraper.py

Or, on Windows:

python scraper.py

This will open a browser showing the target web page, not the error page as with vanilla Selenium:

a browser showing the target web page

The script will extract data from the page and produce the following product.json file:

{
    "title": "Managed WordPress Hosting",
    "tagline": "Get WordPress hosting — simplified",
    "description": "We make it easier to create, launch, and manage your WordPress site"
}

undetected_chromedriver: Advanced Usage

Now that you know how the library works, you are ready to explore some more advanced scenarios.

Choose a Specific Chrome Version

You can specify a particular version of Chrome for the library to use by setting the version_main argument:

import undetected_chromedriver as uc

# Specify the target version of Chrome
driver = uc.Chrome(version_main=105)

Note that the library also works with other Chromium-based browsers, but that requires some additional tweaking.
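For instance, uc.Chrome() accepts a browser_executable_path argument that you can point at another Chromium-based binary. Here is a sketch, assuming a Brave installation at the path below (adjust it for your system; compatibility is not guaranteed):

import undetected_chromedriver as uc

# Hypothetical path to a Chromium-based browser binary (adjust for your system)
driver = uc.Chrome(browser_executable_path="/usr/bin/brave-browser")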

The with Syntax

To avoid manually calling the quit() method when you no longer need the driver, you can use the with syntax as shown below:

import undetected_chromedriver as uc

with uc.Chrome() as driver:
    driver.get("<YOUR_URL>")

When the code inside the with block completes, Python will automatically close the browser for you.

Note: This syntax is supported starting from version 3.1.0.

Proxy Integration

The syntax for adding a proxy to Undetected ChromeDriver is similar to regular Selenium. Simply pass your proxy URL to the --proxy-server flag as shown below:

import undetected_chromedriver as uc

proxy_url = "<YOUR_PROXY_URL>"

options = uc.ChromeOptions()
options.add_argument(f"--proxy-server={proxy_url}")

# Pass the configured options to the patched driver
driver = uc.Chrome(options=options)

Note: Chrome does not support authenticated proxies through the --proxy-server flag.
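If you need authenticated proxies, a common workaround is Selenium Wire, which offers undetected-chromedriver support. Here is a hedged sketch, assuming pip install selenium-wire (the import path and option names may vary across versions; the credentials below are placeholders):

import seleniumwire.undetected_chromedriver as uc

# Credentials travel inside the proxy URL, since Chrome's
# --proxy-server flag cannot carry them
proxy_options = {
    "proxy": {
        "http": "http://<USERNAME>:<PASSWORD>@<PROXY_HOST>:<PROXY_PORT>",
        "https": "http://<USERNAME>:<PASSWORD>@<PROXY_HOST>:<PROXY_PORT>",
    }
}

driver = uc.Chrome(seleniumwire_options=proxy_options)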

Extended API

undetected_chromedriver extends regular Selenium functionality with some methods, including:

  • WebElement.click_safe(): Use this method if clicking a link causes detection. While it is not guaranteed to work, it offers an alternative approach for safer clicks.
  • WebElement.children(tag=None, recursive=False): This method helps you easily find child elements. For example:
# Locate the <body> element first
body = driver.find_element(By.TAG_NAME, "body")
# Take the child at index 6 (of any tag) within the body, then find all <img> elements recursively
images = body.children()[6].children("img", True)
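And here is a minimal click_safe() sketch (the CSS selector is a hypothetical placeholder):

from selenium.webdriver.common.by import By

# Hypothetical selector for illustration
link = driver.find_element(By.CSS_SELECTOR, "a.product-link")
# Fall back to the library's alternative click if a regular click() triggers detection
link.click_safe()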

Limitations of the undetected_chromedriver Python Library

While undetected_chromedriver is a powerful Python library, it does have some known limitations. Here are the most important ones you should be aware of!

IP Blocks

The GitHub page of the library makes it clear: The package does not hide your IP address. So, if you are running a script from a datacenter, chances are high that detection will still occur. Similarly, if your home IP has a poor reputation, you may also be blocked!

The warning about IP blocks on GitHub

To hide your IP, you need to integrate the controlled browser with a proxy server, as demonstrated earlier.

No Support for GUI Navigation

Due to the inner workings of the module, you must browse programmatically using the get() method. Avoid using the browser GUI for manual navigation—interacting with the page using your keyboard or mouse increases the risk of detection!

The same rule applies to handling new tabs. If you need to work with multiple tabs, open a new tab pointing at the blank URL data:, (yes, including the comma), which the driver accepts. After that, switch to the new tab and proceed with your usual automation logic, as shown in the sketch below.
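Here is a minimal sketch of that pattern using standard Selenium calls:

# Open a new tab pointing at the blank "data:," URL
driver.execute_script("window.open('data:,', '_blank')")
# Switch the driver to the newly opened tab
driver.switch_to.window(driver.window_handles[-1])
# Continue navigating programmatically as usual
driver.get("https://example.com")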

Only by adhering to these guidelines can you minimize detection and enjoy smoother web scraping sessions.

Limited Support for Headless Mode

Officially, headless mode is not fully supported by the undetected_chromedriver library. However, you can experiment with it using the following syntax:

driver = uc.Chrome(headless=True)

The author announced in the version 3.4.5 changelog that headless mode should work. Still, it remains unstable. Use this feature with caution and test thoroughly to ensure it meets your scraping needs.

Stability Issues

As mentioned on the package’s PyPI page, results may vary due to numerous factors. No guarantees are provided, other than continuous efforts to understand and counter detection algorithms.

The alert about unpredictable results on PyPI

This means a script that successfully bypasses Distil, Cloudflare, Imperva, DataDome, or hCaptcha today might fail tomorrow if the anti-bot solutions receive updates:

A CAPTCHA trigger by Undetected ChromeDriver

The image above comes from a script provided in the official documentation, demonstrating that even scripts created by the tool’s developers may not always work as expected. Specifically, the script triggered a CAPTCHA, which can easily stop your automation logic.

Find out more in our guide on how to bypass CAPTCHAs in Python.

Further Reading

Undetected ChromeDriver is not the only library that modifies browser drivers to prevent detection. If you are interested in exploring similar tools or learning more about this topic, check out our related guides.

Conclusion

In this article, you understood how to deal with bot detection in Selenium using Undetected ChromeDriver. This library provides a patched version of ChromeDriver for web scraping without getting blocked.

The problem is that advanced anti-bot technologies like Cloudflare will still be able to detect and block your scripts. Libraries like undetected_chromedriver are unstable—while they may work today, they might not work tomorrow.

The problem does not lie with Selenium’s API for controlling a browser, but with the browser’s configuration itself. This implies that the solution is a cloud-based, always-updated, scalable browser with built-in anti-bot bypass functionality. That browser exists, and it is called Scraping Browser!

Bright Data’s Scraping Browser is a highly scalable cloud browser that works with Selenium, Puppeteer, Playwright, and more. It can handle browser fingerprinting, CAPTCHA resolution, and automated retries for you. It also automatically rotates the exit IP on each request, thanks to its worldwide proxy network.

Create a free Bright Data account today to try out our scraping browser or test our proxies.

No credit card required