In this Puppeteer user agent guide, you will see:
- Why setting the User-Agent header for web scraping is crucial
- What the default user agent looks like in Puppeteer
- How to override the default Chrome headless user agent
- How to implement user agent rotation in Puppeteer
- How to use Puppeteer Extra to anonymize the user agent
Let’s dive in!
Why You Need to Set a Custom User Agent
The User-Agent header is a string that the client sets to identify itself to a server when contacting it via an HTTP request. It typically includes information about the machine and/or application from which the request originates. This header is set by web browsers, HTTP clients, or any software that performs web requests.
Below is an example of the user agent string set by Chrome when requesting web pages:
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/127.0.0.0 Safari/537.36
The above user agent string consists of the following components:
- Mozilla/5.0: Originally used to denote compatibility with Mozilla browsers, this prefix is now included for broader compatibility.
- Windows NT 10.0; Win64; x64: Indicates the operating system (Windows NT 10.0), platform (Win64), and system architecture (x64).
- AppleWebKit/537.36: Refers to the WebKit browser engine, from which Chrome's Blink engine was forked.
- (KHTML, like Gecko): Indicates compatibility with the KHTML and Gecko layout engines.
- Chrome/127.0.0.0: Specifies the browser name and version.
- Safari/537.36: Signals compatibility with Safari.
Essentially, the user agent string can reveal whether the request is coming from a well-known browser or another type of software.
The most common mistake in web scraping bots and automation scripts is using default or non-browser user agents. Anti-bot measures designed to protect web pages can detect these values easily. By analyzing the User-Agent header, servers can determine whether a request likely comes from an automated bot.
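To make the idea concrete, here is a minimal sketch of the kind of check a server could run on the User-Agent header. The function name looksAutomated and the token list are illustrative assumptions, not the rules of any real anti-bot product, which typically combine many more signals.

```javascript
// Hypothetical token list: prefixes sent by default by some popular
// non-browser HTTP clients. Chosen for illustration only.
const NON_BROWSER_TOKENS = ["curl/", "python-requests/", "Go-http-client/", "axios/"];

// Hypothetical helper: returns true when the user agent is missing
// or contains one of the non-browser tokens above.
function looksAutomated(userAgent) {
  // a missing or empty header is itself a strong signal
  if (!userAgent) {
    return true;
  }
  return NON_BROWSER_TOKENS.some((token) => userAgent.includes(token));
}

console.log(looksAutomated("curl/8.4.0")); // true
console.log(
  looksAutomated(
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/127.0.0.0 Safari/537.36"
  )
); // false
```

A request that passes a check like this is not necessarily safe, but one that fails it is trivially easy to block.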
For more details, check out our guide on user agents for web scraping.
What Is the Default Puppeteer User Agent?
Puppeteer automates browser tasks by controlling a special version of a real-world web browser. By default, it runs a specific version of Chrome, even though it also supports Firefox. Thus, you might assume that the default user agent in Puppeteer would match the one set by the controlled version of Chrome. Well, that is not the case…
The reason is that Puppeteer launches the browser in headless mode by default. When browsers operate in headless mode, they generally set a distinctive user agent. At the time of writing, the default Puppeteer user agent looks like this:
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/127.0.0.0 Safari/537.36
Note the HeadlessChrome string, which identifies Chrome as running in headless mode. Unsurprisingly, the above string is exactly the user agent of the latest version of headless Chrome at the time of writing.
To confirm that the above string is indeed the Puppeteer default user agent, set up a script and navigate to the httpbin.io/user-agent page. This API endpoint returns the User-Agent header of the request, helping you discover the user agent used by any browser or HTTP client.
Create a Puppeteer script, visit the desired page, retrieve the API response from the body, and print it:
import puppeteer from "puppeteer";

(async () => {
  // launch the browser and open a new page
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  // connect to the target page
  await page.goto("https://httpbin.io/user-agent");

  // extract the body text with the API response
  // and print it
  const bodyText = await page.evaluate(() => {
    return document.body.innerText;
  });
  console.log(bodyText);

  // close the browser and release its resources
  await browser.close();
})();
To learn more about the Puppeteer API, read our guide on web scraping with Puppeteer.
Run the Node.js code above, and you will receive the following string:
{
"user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/127.0.0.0 Safari/537.36"
}
Note that the user agent set by Puppeteer matches the string presented earlier.
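Since the endpoint returns JSON, the same check can be done in code rather than by eye. The sketch below parses the response body and tests for the HeadlessChrome token. Both helper names are hypothetical, and the sample body is the output shown above.

```javascript
// Hypothetical helper: reads the "user-agent" field from the JSON
// returned by the httpbin.io/user-agent endpoint.
function extractUserAgent(bodyText) {
  return JSON.parse(bodyText)["user-agent"];
}

// Hypothetical helper: true when the string carries the headless marker.
function isHeadlessUserAgent(userAgent) {
  return userAgent.includes("HeadlessChrome");
}

// sample body copied from the output above
const sampleBody =
  '{ "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/127.0.0.0 Safari/537.36" }';

console.log(isHeadlessUserAgent(extractUserAgent(sampleBody))); // true
```

In the script above, you would pass the bodyText variable to extractUserAgent() instead of the hardcoded sample.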
The problem lies in the HeadlessChrome identifier, which can alert anti-bot systems. These systems analyze incoming requests for patterns that might indicate bot activity, such as unusual user agent strings. Suspicious requests are flagged and blocked accordingly. That is why modifying the default Puppeteer user agent string is so crucial!
How to Change the Puppeteer User Agent
Changing the user agent is such a common and useful operation that Puppeteer provides a method specifically for that. In particular, the Page class exposes the setUserAgent() method. This allows you to modify the User-Agent header set by Puppeteer when navigating to web pages in that browser tab.
Use setUserAgent() to change the Puppeteer user agent as below:
await page.setUserAgent("<your_user_agent>");
All requests made through page, such as the navigation performed by calling the goto() method, will now have a custom User-Agent header. Keep in mind that this change applies only to the specific page object. If you open a new page and interact with it, Puppeteer will use the default user agent seen earlier.
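One way to work around this per-page scope is to wrap page creation in a small helper that always applies the user agent before handing the page back. The sketch below is a minimal version of that idea. The function name newPageWithUserAgent is hypothetical, and the browser argument is assumed to be the object returned by puppeteer.launch().

```javascript
// Hypothetical helper: opens a new tab and sets the given user agent
// on it before returning, so the tab never navigates with the default.
async function newPageWithUserAgent(browser, userAgent) {
  const page = await browser.newPage();
  await page.setUserAgent(userAgent);
  return page;
}

// usage inside an async function, after launching the browser:
// const page1 = await newPageWithUserAgent(browser, "<your_user_agent>");
// const page2 = await newPageWithUserAgent(browser, "<your_user_agent>");
```

This only covers pages created through the helper. Tabs opened by other means, such as popups triggered by the site, would still use the default value.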
For a complete example, take a look at the following snippet:
import puppeteer from "puppeteer";

(async () => {
  // launch the browser and open a new page
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  // set a custom user agent
  await page.setUserAgent("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/127.0.0.0 Safari/537.36");

  // connect to the target page
  await page.goto("https://httpbin.io/user-agent");

  // extract the body text with the API response
  // and print it
  const bodyText = await page.evaluate(() => {
    return document.body.innerText;
  });
  console.log(bodyText);

  // close the browser and release its resources
  await browser.close();
})();
Execute the above script, and this time the result will be:
{
"user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/127.0.0.0 Safari/537.36"
}
Wonderful! The user agent string extracted from the page matches the user agent configured in the code. You now know how to change the user agent in Puppeteer.
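A hardcoded user agent can drift out of date as the Chrome version bundled with Puppeteer changes. One possible alternative, sketched below, is to start from the browser's own user agent and remove only the headless marker. The helper name toRegularUserAgent is hypothetical. The commented usage relies on Puppeteer's browser.userAgent() method, which returns the browser's original user agent.

```javascript
// Hypothetical helper: turns "HeadlessChrome/<version>" into
// "Chrome/<version>" and leaves the rest of the string untouched.
function toRegularUserAgent(userAgent) {
  return userAgent.replace("HeadlessChrome/", "Chrome/");
}

const defaultUserAgent =
  "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/127.0.0.0 Safari/537.36";

console.log(toRegularUserAgent(defaultUserAgent));
// Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/127.0.0.0 Safari/537.36

// usage in a Puppeteer script (inside an async function):
// const originalUserAgent = await browser.userAgent();
// await page.setUserAgent(toRegularUserAgent(originalUserAgent));
```

With this approach, the version and platform in the header stay consistent with the browser that is actually running.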
Implement User Agent Rotation in Puppeteer
Replacing the User-Agent header of a headless browser with one from a non-headless browser may not be enough to elude anti-bot systems. The problem is that your browser automation script will exhibit patterns that indicate non-human behavior. That is particularly true if you send a large number of requests with the same headers from the same IP address.
To minimize the risk of bot detection in Puppeteer, you need to make your requests as varied as possible. An effective method is to set a different user agent for each request. This strategy is called user agent rotation and helps you reduce the likelihood of getting flagged as a bot.
In the following tutorial section, you will learn how to implement user agent rotation in Puppeteer!
Step #1: Generate a Random User Agent
There are two main approaches to getting a random user agent. The first is to get a list of valid user agents and randomly select one from the list, as shown in our Node.js user agent guide.
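As a rough illustration of this first approach, the sketch below picks one entry at random from a hardcoded list. The list contents and the helper name pickRandomUserAgent are sample values chosen for this example. A real list would be longer and kept up to date.

```javascript
// Sample list for illustration only.
const USER_AGENTS = [
  "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/127.0.0.0 Safari/537.36",
  "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/127.0.0.0 Safari/537.36",
  "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/127.0.0.0 Safari/537.36",
];

// Hypothetical helper: returns one entry of the list,
// chosen uniformly at random.
function pickRandomUserAgent(userAgents) {
  const index = Math.floor(Math.random() * userAgents.length);
  return userAgents[index];
}

console.log(pickRandomUserAgent(USER_AGENTS));
```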
The second approach is to use a third-party user agent generator library. The most popular in JavaScript is user-agents, a package that can generate valid user agents for you. In this guide, we will follow this method!
Run this npm command to add the user-agents library to your project’s dependencies:
npm install user-agents
Note that user-agents is updated daily, ensuring it always includes the latest user agents.
After installing the library, import the UserAgent object into your Puppeteer script:
import UserAgent from "user-agents";
You can now generate a random user agent string with the following code:
const userAgent = new UserAgent().random().toString();
Check out the official documentation for more advanced usage, such as getting user agents for specific devices, operating systems, and more.
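As one example of that more advanced usage, the UserAgent constructor accepts a filter object, and deviceCategory is one of the properties it supports. The sketch below restricts generation to desktop browsers, which may be a closer match for a desktop Chrome instance controlled by Puppeteer. The function name is hypothetical, and the class is passed in as a parameter only to keep the snippet free of dependencies.

```javascript
// Hypothetical helper: generates a random desktop user agent.
// `UserAgentClass` is assumed to be the default export of the
// user-agents library, i.e. what `import UserAgent from "user-agents";`
// provides.
function randomDesktopUserAgent(UserAgentClass) {
  return new UserAgentClass({ deviceCategory: "desktop" }).toString();
}

// usage in a script that imports the library:
// const userAgent = randomDesktopUserAgent(UserAgent);
```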
Step #2: Set the Random User Agent
Pass the random user agent string to Puppeteer by calling the setUserAgent() method:
const userAgent = new UserAgent().random().toString();
await page.setUserAgent(userAgent);
All requests performed via the page object will now have a custom user agent, generated randomly.
Step #3: Visit the Target Page
Call the goto() method on page to visit your target webpage in the controlled headless browser:
await page.goto("https://httpbin.io/user-agent");
Step #4: Put It All Together
Below is your final Puppeteer user agent rotation script:
import puppeteer from "puppeteer";
import UserAgent from "user-agents";

(async () => {
  // launch the browser and open a new page
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  // generate a random user agent
  const userAgent = new UserAgent().random().toString();
  // set a random user agent
  await page.setUserAgent(userAgent);

  // connect to the target page
  await page.goto("https://httpbin.io/user-agent");

  // extract the body text with the API response
  // and print it
  const bodyText = await page.evaluate(() => {
    return document.body.innerText;
  });
  console.log(bodyText);

  // close the browser and release its resources
  await browser.close();
})();
Run the script a few times, and you should see different user agents.
Et voilà! The user agent rotation logic in Puppeteer works like a charm.
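The script above sets a single user agent per run. To rotate within one run, the user agent can be regenerated before each navigation. Below is a minimal sketch of that loop. The helper name visitWithRotation is hypothetical, page is assumed to be a Puppeteer page, and the generator function is passed in so the snippet has no dependencies of its own.

```javascript
// Hypothetical helper: visits each URL with a freshly generated user agent.
// `generateUserAgent` is any function returning a user agent string,
// e.g. () => new UserAgent().random().toString()
async function visitWithRotation(page, urls, generateUserAgent) {
  const results = [];
  for (const url of urls) {
    // set a new user agent before every navigation
    const userAgent = generateUserAgent();
    await page.setUserAgent(userAgent);
    await page.goto(url);

    // read the body text, as in the script above
    const bodyText = await page.evaluate(() => document.body.innerText);
    results.push({ url, userAgent, bodyText });
  }
  return results;
}

// usage inside the async function of the script above:
// const results = await visitWithRotation(
//   page,
//   ["https://httpbin.io/user-agent", "https://httpbin.io/user-agent"],
//   () => new UserAgent().random().toString()
// );
```

Navigations run one after another here, which keeps the logic simple and ensures each request uses the user agent set just before it.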
Set a Custom User Agent in Puppeteer With puppeteer-extra-plugin-anonymize-ua
The above approaches to setting the user agent in Puppeteer are effective, but they have a significant drawback. The setUserAgent() method only changes the user agent for a specific page, not across all browser tabs.
To ensure Puppeteer never uses the default user agent, you can use the puppeteer-extra-plugin-anonymize-ua plugin from Puppeteer Extra. If you are not familiar with this project, Puppeteer Extra extends Puppeteer by adding support for community-defined plugins. You can discover more in our guide on the Puppeteer Extra Stealth plugin.
The puppeteer-extra-plugin-anonymize-ua plugin can anonymize the user agent set by Puppeteer. To install puppeteer-extra and the necessary plugin, run the following command:
npm install puppeteer-extra puppeteer-extra-plugin-anonymize-ua
Next, import puppeteer from puppeteer-extra and AnonymizeUAPlugin from puppeteer-extra-plugin-anonymize-ua:
import puppeteer from "puppeteer-extra";
import AnonymizeUAPlugin from "puppeteer-extra-plugin-anonymize-ua";
Configure puppeteer-extra-plugin-anonymize-ua to generate a random user agent and register it as a plugin with the use() method from Puppeteer Extra:
puppeteer.use(
  AnonymizeUAPlugin({
    // UserAgent is the class imported from the user-agents library
    customFn: () => new UserAgent().random().toString(),
  })
);
Now, try visiting pages in two different tabs using two separate page objects:
import puppeteer from "puppeteer-extra";
import AnonymizeUAPlugin from "puppeteer-extra-plugin-anonymize-ua";
import UserAgent from "user-agents";

(async () => {
  // configure and register the
  // puppeteer-extra-plugin-anonymize-ua plugin
  puppeteer.use(
    AnonymizeUAPlugin({
      customFn: () => new UserAgent().random().toString(),
    })
  );

  // launch the browser
  const browser = await puppeteer.launch();

  // open a new page
  const page1 = await browser.newPage();
  // connect to the target page
  await page1.goto("https://httpbin.io/user-agent");
  // extract the body text with the API response
  // and print it
  const bodyText1 = await page1.evaluate(() => {
    return document.body.innerText;
  });
  console.log(bodyText1);

  // open a second page
  const page2 = await browser.newPage();
  // connect to the target page
  await page2.goto("https://httpbin.io/user-agent");
  // extract the body text with the API response
  // and print it
  const bodyText2 = await page2.evaluate(() => {
    return document.body.innerText;
  });
  console.log(bodyText2);

  // close the browser and release its resources
  await browser.close();
})();
The result will be two different user agents, as shown below:
{
"user-agent": "Mozilla/5.0 (iPhone; CPU iPhone OS 17_5_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.5 Mobile/15E148 Safari/604.1"
}
{
"user-agent": "Mozilla/5.0 (Linux; Android 10; K) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/127.0.0.0 Mobile Safari/537.36"
}
The two user agents are different, and neither is the default Puppeteer user agent. That occurs because puppeteer-extra-plugin-anonymize-ua customizes each page object with a random user agent as specified in the customFn function. This way, Puppeteer will never expose its default user agent!
Conclusion
In this article, you explored the importance of setting the User-Agent header and saw what the default Puppeteer user agent looks like. You learned how to override that value and implement user agent rotation to elude basic anti-scraping systems. However, more sophisticated systems can still detect and block your automated requests. To prevent IP bans, you can configure a proxy in Puppeteer, but even that may not always be enough!
For a more effective solution, try Scraping Browser, a next-generation browser that integrates with Puppeteer and any other browser automation tool. Scraping Browser can effortlessly bypass anti-bot technologies for you while avoiding browser fingerprinting. Under the hood, it relies on features like user agent rotation, IP rotation, and CAPTCHA solving. Browser automation has never been easier!
Sign up now and start your free trial.
No credit card required