In this tutorial, you will learn:
- Why the User-Agent header is so important
- The default Selenium user agent value in both headed and headless browsers
- How to change the user agent in Selenium
- How to implement user agent rotation in Selenium
Let’s dive in!
Why Is the User Agent Header Important?
The User-Agent header is a string that identifies the client software making the HTTP request. It usually includes information about the browser or application type, operating system, and architecture the request comes from. This is generally set by browsers, HTTP clients, or any other application performing web requests.
For instance, below is the user agent set by Chrome as of this writing:
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36
The components of this user agent string are:
- Mozilla/5.0: Historically used to indicate compatibility with Mozilla browsers. It now represents a prefix added for compatibility reasons.
- Windows NT 10.0; Win64; x64: Operating system (Windows NT 10.0), platform (Win64), and architecture (x64).
- AppleWebKit/537.36: Browser engine Chrome relies on.
- KHTML, like Gecko: Compatibility with the KHTML engine and Gecko layout engine used by Mozilla.
- Chrome/125.0.0.0: Browser name and version.
- Safari/537.36: Compatibility with Safari.
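To make the breakdown above concrete, here is a minimal sketch that splits a user agent string into its "product/version (comment)" parts with a regular expression. The `split_user_agent()` helper and its pattern are illustrative assumptions, not a full parser for every user agent format in the wild.

```python
import re

# Matches "Product/Version" tokens, each optionally followed by a "(comment)" group
TOKEN_PATTERN = re.compile(
    r"(?P<product>[^\s/()]+)/(?P<version>[^\s()]+)(?:\s+\((?P<comment>[^)]*)\))?"
)


def split_user_agent(user_agent: str) -> list:
    """Break a user agent string into product/version/comment dictionaries."""
    return [
        {
            "product": match.group("product"),
            "version": match.group("version"),
            "comment": match.group("comment"),  # None when no "(...)" follows
        }
        for match in TOKEN_PATTERN.finditer(user_agent)
    ]


chrome_ua = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36"
)

# print one line per component
for part in split_user_agent(chrome_ua):
    print(part)
```

Each printed dictionary corresponds to one component of the Chrome string, with the parenthesized text attached to the token that precedes it.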
Simply put, the user agent identifies whether the request comes from a known browser or another type of software.
Scraping bots and browser automation scripts tend to use default or inconsistent user agent strings. These reveal their automated nature in the eyes of anti-scraping solutions, which protect web page data by monitoring incoming requests. By looking at the User-Agent header, they can determine whether the current user is legitimate or a bot.
For more details, read our guide on user agents for web scraping.
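As a rough illustration of the kind of check described above, the sketch below flags a request whose User-Agent header is missing or contains a well-known automation marker. The `looks_automated()` function and the marker list are hypothetical: real anti-bot systems are not public and combine many more signals than this single header.

```python
# Hypothetical markers a very basic filter might look for.
# "python-requests" and "curl/" appear in the default user agents of those tools.
SUSPICIOUS_MARKERS = ("HeadlessChrome", "python-requests", "curl/")


def looks_automated(user_agent: str) -> bool:
    """Return True when the header is empty or contains a known automation marker."""
    if not user_agent:
        return True
    return any(marker in user_agent for marker in SUSPICIOUS_MARKERS)


browser_ua = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36"
)

print(looks_automated(browser_ua))
print(looks_automated("python-requests/2.32.3"))
```

A filter this simple is easy to get past by changing the header, which is exactly what the rest of this tutorial covers.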
What Is the Default Selenium User Agent?
The User-Agent header set by Selenium when making the HTTP GET request to retrieve a web page depends on the browser under control and whether it is in headed or headless mode.
Note: In this article, we will use Selenium in Python and configure it to operate on Chrome. However, you can easily extend what you will learn here to different programming languages and browsers.
To see the Selenium user agent string, create a basic browser automation script that visits the httpbin.org /user-agent endpoint. This is nothing more than an API that returns the User-Agent header of the incoming request.
Import selenium, initialize a Chrome instance, visit the desired page, and print its content:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
options = Options()
# headless mode is disabled for now: uncomment the next line to enable it
# options.add_argument('--headless')
# initialize a Chrome instance
driver = webdriver.Chrome(
options=options,
)
# visit the desired page
driver.get("https://httpbin.org/user-agent")
# get the page content
user_agent_info = driver.find_element(By.TAG_NAME, "body").text
# print the page content
print(user_agent_info)
# close the browser
driver.quit()
Launch the above Python script, and it will log in the terminal something like:
{
"user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36"
}
The value corresponds to the User-Agent header set by Chrome at the time of writing. That should not surprise you, as Selenium operates on a real browser window.
At the same time, Selenium is typically configured to control headless browser instances. The reason is that loading the UI of a browser takes a lot of resources and adds no benefit in production. So, uncomment the --headless option to run the script in headless mode. This time, the result will be:
{
"user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/125.0.6422.142 Safari/537.36"
}
As you can see, Chrome/125.0.0.0 was replaced by HeadlessChrome/125.0.6422.142. This value clearly identifies the request as coming from a browser automation tool, since regular users do not browse with a headless browser. As a consequence, anti-bot systems can mark such a request as coming from a bot and block it. That is why it is so crucial to set the Selenium user agent value!
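If you want a replacement value that matches the browser version you are actually running, one option is to derive it from the headless string itself. The sketch below rewrites the HeadlessChrome token into the Chrome/<major>.0.0.0 form seen in the headed output above. The `to_headed_user_agent()` helper is hypothetical, and it assumes that the headed value always follows that reduced version format.

```python
import re


def to_headed_user_agent(headless_user_agent: str) -> str:
    """Rewrite 'HeadlessChrome/<major>.x.y.z' as 'Chrome/<major>.0.0.0'."""
    return re.sub(
        r"HeadlessChrome/(\d+)(?:\.\d+)*",  # capture only the major version
        r"Chrome/\1.0.0.0",
        headless_user_agent,
    )


headless_ua = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) HeadlessChrome/125.0.6422.142 Safari/537.36"
)

print(to_headed_user_agent(headless_ua))
```

Strings that do not contain the HeadlessChrome token are returned unchanged.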
Find out more information in our guide on Selenium web scraping.
How To Change User Agent in Selenium
Selenium provides two ways to set the user agent value. Let’s dig into them both!
Set the User Agent Globally
Among the options supported by Chrome, there is also the --user-agent flag. This allows you to specify the global user agent the Chrome process should use when visiting web pages in its tabs or windows.
Set a global user agent in Selenium with Python as below:
custom_user_agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36"
options = Options()
# set a custom user agent in the browser option
options.add_argument(f'--user-agent={custom_user_agent}')
# other options...
# initialize a Chrome instance with a custom user agent
driver = webdriver.Chrome(
options=options,
)
Put it all together and verify that it works with the following script:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
custom_user_agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36"
options = Options()
# set a custom user agent in the browser option
options.add_argument(f'--user-agent={custom_user_agent}')
# enable headless mode
options.add_argument('--headless')
# initialize a headless Chrome instance with a custom user agent
driver = webdriver.Chrome(
options=options,
)
# visit the desired page
driver.get("https://httpbin.org/user-agent")
# get the page content
user_agent_info = driver.find_element(By.TAG_NAME, "body").text
# print the page content
print(user_agent_info)
# close the browser
driver.quit()
Now, launch the script and it will print:
{
"user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36"
}
This matches the user agent specified in the custom_user_agent string. In particular, the browser controlled via Selenium now exposes the user agent value of a headed browser even if it is in headless mode. That should be enough to fool less complex anti-bot solutions.
The main disadvantage of this approach is that you can only set the --user-agent flag once, during the browser instance setup. Once specified, the custom user agent will be used for the entire browsing session, with no possibility to change it on the fly before a get() call.
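Because the flag is fixed at startup, switching to a different value with this approach means starting a new browser instance. The sketch below wraps that setup in two helpers so each new instance can receive its own user agent. Both `build_chrome_arguments()` and `create_driver()` are hypothetical names introduced here for illustration.

```python
def build_chrome_arguments(user_agent: str, headless: bool = True) -> list:
    """Return the Chrome command-line flags for a fixed user agent."""
    arguments = [f"--user-agent={user_agent}"]
    if headless:
        arguments.append("--headless")
    return arguments


def create_driver(user_agent: str, headless: bool = True):
    """Start a Chrome instance with the given user agent (needs Selenium and Chrome)."""
    # imported lazily so build_chrome_arguments() works without Selenium installed
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    for argument in build_chrome_arguments(user_agent, headless):
        options.add_argument(argument)
    return webdriver.Chrome(options=options)


# show the flags that would be passed to Chrome
print(build_chrome_arguments("Mozilla/5.0 (X11; Linux x86_64)"))
```

Calling `create_driver()` once per user agent works, but launching a browser each time is expensive, which is the limitation the next approach addresses.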
Set the User Agent Locally
Chrome DevTools Protocol (CDP) commands let you communicate with a running Chrome browser. In particular, they give you the ability to dynamically change default values and configurations set by the browser.
You can execute a CDP command in Selenium by using the execute_cdp_cmd() method exposed by the driver object. Specifically, the Network.setUserAgentOverride CDP command overrides the user agent with the given string. Use it to locally change the user agent in Selenium as below:
custom_user_agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36"
driver.execute_cdp_cmd('Network.setUserAgentOverride', {'userAgent': custom_user_agent})
Verify that this approach gives you the ability to update the user agent multiple times within the same browsing session with the following logic:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
options = Options()
# enable headless mode
options.add_argument('--headless')
# initialize a headless Chrome instance
driver = webdriver.Chrome(
options=options,
)
# configure a custom user agent
custom_user_agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36"
driver.execute_cdp_cmd('Network.setUserAgentOverride', {'userAgent': custom_user_agent})
# visit the desired page
driver.get("https://httpbin.org/user-agent")
# get the page content and print it
user_agent_info = driver.find_element(By.TAG_NAME, "body").text
print(user_agent_info)
# set another user agent
custom_user_agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:126.0) Gecko/20100101 Firefox/126.0"
driver.execute_cdp_cmd('Network.setUserAgentOverride', {'userAgent': custom_user_agent})
# reload the page
driver.refresh()
# print the page content
user_agent_info = driver.find_element(By.TAG_NAME, "body").text
print(user_agent_info)
# close the browser
driver.quit()
Launch the above script, and it will produce:
{
"user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36"
}
{
"user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:126.0) Gecko/20100101 Firefox/126.0"
}
Awesome! Two different Selenium user agent strings within the same browsing session.
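The CDP documentation also lists optional acceptLanguage and platform parameters for Network.setUserAgentOverride, which can help keep the rest of the browser identity consistent with the string you set. Here is a minimal sketch of a helper that builds the parameter dictionary. `build_user_agent_override()` is a hypothetical name, and the example values are assumptions rather than recommendations.

```python
from typing import Optional


def build_user_agent_override(
    user_agent: str,
    accept_language: Optional[str] = None,
    platform: Optional[str] = None,
) -> dict:
    """Build the parameters for the Network.setUserAgentOverride CDP command."""
    params = {"userAgent": user_agent}
    # optional fields are only sent when a value is provided
    if accept_language is not None:
        params["acceptLanguage"] = accept_language
    if platform is not None:
        params["platform"] = platform
    return params


params = build_user_agent_override(
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:126.0) Gecko/20100101 Firefox/126.0",
    accept_language="en-US",
    platform="Win32",
)
print(params)

# with a running driver, you would then call:
# driver.execute_cdp_cmd("Network.setUserAgentOverride", params)
```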
Implement User Agent Rotation in Selenium
Setting a non-headless User-Agent header may not be enough to overcome anti-bots. The problem is that too many requests coming from the same IP address and with the same headers are likely to unveil the automated nature of your Selenium script.
The key to avoiding bot detection is to randomize your requests, such as by implementing user agent rotation. The idea behind this approach is to randomly pick a user agent before navigating to a page in Selenium. That way, your automated requests will appear as coming from different browsers, reducing the risk of triggering blocks and bans.
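A purely random pick can return the same value twice in a row. As an optional variation on the idea above, the sketch below excludes the previously used user agent whenever an alternative exists. The `pick_user_agent()` helper is hypothetical and not part of Selenium.

```python
import random

user_agents = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 14.5; rv:126.0) Gecko/20100101 Firefox/126.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 14_5) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.4.1 Safari/605.1.15",
]


def pick_user_agent(candidates, previous=None):
    """Randomly pick a user agent, avoiding the previous one when possible."""
    # fall back to the full list if excluding the previous value leaves nothing
    pool = [ua for ua in candidates if ua != previous] or list(candidates)
    return random.choice(pool)


previous = None
for _ in range(5):
    previous = pick_user_agent(user_agents, previous)
    print(previous)
```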
Now, follow the steps below and learn how to implement user agent rotation in Selenium!
Step #1: Retrieve a List of User Agents
Get some valid user agents from a portal like User Agent String.com and store them in a Python list as follows:
user_agents = [
"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36",
"Mozilla/5.0 (Macintosh; Intel Mac OS X 14.5; rv:126.0) Gecko/20100101 Firefox/126.0",
"Mozilla/5.0 (Macintosh; Intel Mac OS X 14_5) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.4.1 Safari/605.1.15"
# other user agents...
]
Step #2: Extract a Random User Agent
Define a custom function to set the random user agent on the Selenium web driver object:
def set_user_agent(driver):
    # set the user agent...
    pass
Import the random package from the Python Standard Library to get ready to randomly pick a user agent from the user_agents list:
import random
Use the random.choice() function to randomly extract a user agent string from the list:
random_user_agent = random.choice(user_agents)
Then, assign it to the Chrome window with the execute_cdp_cmd() function:
driver.execute_cdp_cmd('Network.setUserAgentOverride', {'userAgent': random_user_agent})
Your set_user_agent() function will now contain:
def set_user_agent(driver):
    # randomly pick a user agent string from the list
    random_user_agent = random.choice(user_agents)
    # set the user agent in the driver
    driver.execute_cdp_cmd('Network.setUserAgentOverride', {'userAgent': random_user_agent})
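You can sanity-check this function without launching Chrome by passing it a stand-in object that only records the CDP calls it receives. The `FakeDriver` class below is a hypothetical test double, not a Selenium class, and the function is repeated so that the snippet runs on its own.

```python
import random

user_agents = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 14.5; rv:126.0) Gecko/20100101 Firefox/126.0",
]


def set_user_agent(driver):
    # randomly pick a user agent string from the list
    random_user_agent = random.choice(user_agents)
    # set the user agent in the driver
    driver.execute_cdp_cmd('Network.setUserAgentOverride', {'userAgent': random_user_agent})


class FakeDriver:
    """Hypothetical stand-in that records CDP calls instead of talking to Chrome."""

    def __init__(self):
        self.calls = []

    def execute_cdp_cmd(self, command, params):
        self.calls.append((command, params))
        return {}


fake_driver = FakeDriver()
set_user_agent(fake_driver)

# inspect the single recorded call
command, params = fake_driver.calls[0]
print(command)
print(params["userAgent"] in user_agents)
```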
Step #3: Set the Random User Agent
Before navigating to a page with get(), call the set_user_agent() function to change the Selenium user agent:
# set a custom user agent
set_user_agent(driver)
# visit the desired page
driver.get("https://httpbin.org/user-agent")
Step #4: Put It All Together
This is what your Python Selenium user agent rotation script will look like:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
import random
user_agents = [
"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36",
"Mozilla/5.0 (Macintosh; Intel Mac OS X 14.5; rv:126.0) Gecko/20100101 Firefox/126.0",
"Mozilla/5.0 (Macintosh; Intel Mac OS X 14_5) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.4.1 Safari/605.1.15"
]
def set_user_agent(driver):
    # randomly pick a user agent string from the list
    random_user_agent = random.choice(user_agents)
    # set the user agent in the driver
    driver.execute_cdp_cmd('Network.setUserAgentOverride', {'userAgent': random_user_agent})
options = Options()
# enable headless mode
options.add_argument('--headless')
# initialize a headless Chrome instance
driver = webdriver.Chrome(
options=options,
)
# set a custom user agent
set_user_agent(driver)
# visit the desired page
driver.get("https://httpbin.org/user-agent")
# get the page content and print it
user_agent_info = driver.find_element(By.TAG_NAME, "body").text
print(user_agent_info)
# close the browser
driver.quit()
Execute this script a few times and notice that it will print different user agent strings.
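The script above picks one user agent per run. Since the CDP override can be applied at any time, you could also rotate before every navigation within the same session. Here is a minimal sketch of that loop. `visit_with_rotation()` is a hypothetical helper that expects any object exposing `execute_cdp_cmd()` and `get()`, such as a Chrome driver created as shown earlier.

```python
import random


def visit_with_rotation(driver, urls, user_agents):
    """Visit each URL with a freshly picked user agent.

    Returns the list of (url, user_agent) pairs that were used.
    """
    used = []
    for url in urls:
        # pick and apply a new user agent before each navigation
        user_agent = random.choice(user_agents)
        driver.execute_cdp_cmd("Network.setUserAgentOverride", {"userAgent": user_agent})
        driver.get(url)
        used.append((url, user_agent))
    return used


# example usage with a real driver (requires Selenium and Chrome):
# pairs = visit_with_rotation(driver, ["https://httpbin.org/user-agent"] * 3, user_agents)
# for url, user_agent in pairs:
#     print(url, user_agent)
```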
Et voilà! You are now a master at changing the user agent in Selenium.
Conclusion
In this guide, you learned the importance of the User-Agent header and how to override it in Selenium. This technique enables you to trick basic anti-bot systems into thinking that your requests come from a legitimate non-headless browser. However, advanced solutions may still be able to detect and block you. To prevent IP bans, you could integrate a proxy with Selenium, but even that might not be enough!
Avoid those issues with Scraping Browser, a next-generation browser that integrates with Selenium and any other browser automation tool. Scraping Browser can effortlessly bypass anti-bot technologies for you while avoiding browser fingerprinting. Under the hood, it relies on features like user agent rotation, IP rotation, and CAPTCHA solving. Browser automation has never been easier!