Puppeteer User Agent Guide: Setting and Changing

Master the techniques for setting and rotating user agents in Puppeteer to enhance your web scraping efforts and bypass anti-bot defenses.

In this Puppeteer user agent guide, you will see:

  • Why setting the User-Agent header for web scraping is crucial
  • What the default user agent looks like in Puppeteer
  • How to override the default Chrome headless user agent
  • How to implement user agent rotation in Puppeteer
  • How to use Puppeteer Extra to anonymize the user agent

Let’s dive in!

Why You Need to Set a Custom User Agent

The User-Agent header is a string that the client sets to identify itself to a server when contacting it via an HTTP request. It typically includes information about the machine and/or application from which the request originates. This header is set by web browsers, HTTP clients, or any software that performs web requests.

Below is an example of the user agent string set by Chrome when requesting web pages:

Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/127.0.0.0 Safari/537.36

The above user agent string consists of the following components:

  • Mozilla/5.0: Originally used to denote compatibility with Mozilla browsers, this prefix is now included for broader compatibility.
  • Windows NT 10.0; Win64; x64: Indicates the operating system (Windows NT 10.0), platform (Win64), and system architecture (x64).
  • AppleWebKit/537.36: Refers to the browser engine that Chrome uses.
  • (KHTML, like Gecko): Indicates compatibility with the KHTML and Gecko layout engines.
  • Chrome/127.0.0.0: Specifies the browser name and version.
  • Safari/537.36: Signals compatibility with Safari.

Essentially, the user agent string can reveal whether the request is coming from a well-known browser or another type of software.
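To make the breakdown above concrete, here is a minimal sketch that splits a user agent string into its product/version tokens and parenthesized comments. The parseUserAgent() helper is a hypothetical name introduced for illustration; it is not part of Puppeteer or any library, and the regular expression only covers the common token shapes shown in this guide.

```javascript
// Illustrative helper (not part of Puppeteer): split a User-Agent string into
// product/version tokens and parenthesized comments.
function parseUserAgent(userAgent) {
  const parts = [];
  // Matches either "Product/Version" or "(free-form comment)"
  const pattern = /([^\s\/()]+)\/([^\s()]+)|\(([^)]*)\)/g;
  for (const match of userAgent.matchAll(pattern)) {
    if (match[1] !== undefined) {
      parts.push({ product: match[1], version: match[2] });
    } else {
      parts.push({ comment: match[3] });
    }
  }
  return parts;
}

const chromeUserAgent =
  "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/127.0.0.0 Safari/537.36";

// Print each component on its own line
for (const part of parseUserAgent(chromeUserAgent)) {
  console.log(part);
}
```

Each printed entry corresponds to one of the bullet points listed earlier, in the same order.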

A common mistake in web scraping bots and automation scripts is to rely on default or non-browser user agents. Anti-bot measures designed to protect web pages can easily detect these values. By analyzing the User-Agent header, servers can infer whether a request is likely to come from an automated bot.
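As a rough idea of what such a check might look like on the server side, consider the deliberately simplistic sketch below. The looksAutomated() function and the token list are assumptions made for illustration only; real anti-bot systems combine many more signals than the User-Agent header.

```javascript
// Hypothetical, deliberately naive server-side heuristic.
// The token list is illustrative, not taken from any real anti-bot product.
const SUSPICIOUS_TOKENS = ["headlesschrome", "python-requests", "curl/", "node-fetch", "axios/"];

function looksAutomated(userAgent) {
  // A missing or empty User-Agent header is itself a red flag
  if (!userAgent || userAgent.trim() === "") {
    return true;
  }
  const lower = userAgent.toLowerCase();
  return SUSPICIOUS_TOKENS.some((token) => lower.includes(token));
}

// A default HTTP client user agent versus a regular browser user agent
console.log(looksAutomated("python-requests/2.31.0"));
console.log(
  looksAutomated(
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/127.0.0.0 Safari/537.36"
  )
);
```

The first call is flagged, while the second is not, which is why a browser-like user agent is the minimum requirement for a scraper.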

For more details, check out our guide on user agents for web scraping.

What Is the Default Puppeteer User Agent?

Puppeteer automates browser tasks by controlling a special version of a real-world web browser. By default, it runs a specific version of Chrome, even though it also supports Firefox. Thus, you might assume that the default user agent in Puppeteer would match the one set by the controlled version of Chrome. Well, that is not the case…

The reason is that Puppeteer launches the browser in headless mode by default. When browsers operate in headless mode, they generally set a distinctive user agent. As of the current version, the default Puppeteer user agent looks like this:

Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/127.0.0.0 Safari/537.36

Note the HeadlessChrome string, which identifies Chrome as running in headless mode. Unsurprisingly, the above string is exactly the user agent that headless Chrome sets on its own.

To confirm that the above string is indeed the Puppeteer default user agent, set up a script and navigate to the httpbin.io/user-agent page. This API endpoint returns the User-Agent header of the request, helping you discover the user agent used by any browser or HTTP client.

Create a Puppeteer script, visit the desired page, retrieve the API response from the body, and print it:

import puppeteer from "puppeteer";

(async () => {
  // launch the browser and open a new page
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  // connect to the target page
  await page.goto("https://httpbin.io/user-agent");

  // extract the body text with the API response
  // and print it
  const bodyText = await page.evaluate(() => {
    return document.body.innerText;
  });
  console.log(bodyText);

  // close the browser and release its resources
  await browser.close();
})();

To learn more about the Puppeteer API, read our guide on web scraping with Puppeteer.

Run the Node.js code above, and you will receive the following string:

{
  "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/127.0.0.0 Safari/537.36"
}

Note that the user agent set by Puppeteer matches the string presented earlier.
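If you prefer to check the result programmatically rather than by eye, the body text can be parsed as JSON. The sketch below assumes the response has the shape shown above; extractUserAgent() is a hypothetical helper name, and the sample string stands in for the bodyText value returned by the script.

```javascript
// Hypothetical helper: pull the "user-agent" field out of the
// JSON text returned by the httpbin.io/user-agent endpoint.
function extractUserAgent(bodyText) {
  const payload = JSON.parse(bodyText);
  return payload["user-agent"];
}

// Sample response, standing in for the bodyText read from the page
const sampleBody =
  '{ "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/127.0.0.0 Safari/537.36" }';

const reportedUserAgent = extractUserAgent(sampleBody);
console.log(reportedUserAgent.includes("HeadlessChrome")); // true
```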

The problem lies in the HeadlessChrome identifier, which can alert anti-bot systems. These systems analyze incoming requests for patterns that might indicate bot activity, such as unusual user agent strings, and then flag and block suspicious requests. This is why overriding the default Puppeteer user agent is so important!

How to Change the Puppeteer User Agent

Changing the user agent is such a common and useful operation that Puppeteer provides a method specifically for that. In particular, the Page class exposes the setUserAgent() method. This allows you to modify the User-Agent set by Puppeteer when navigating to web pages in that browser tab.

Use setUserAgent() to change the Puppeteer user agent as below:

await page.setUserAgent("<your_user_agent>");

All requests made through that page, including those triggered by calling the goto() method, will now carry the custom User-Agent header. Keep in mind that this change applies only to the specific page object. If you open a new page and interact with it, Puppeteer will use the default user agent seen earlier.
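Because the override is tied to a single page object, one way to avoid forgetting it is to wrap page creation in a small helper. The sketch below is only an illustration: newPageWithUserAgent() is a hypothetical name, and it assumes you pass in the browser object returned by puppeteer.launch().

```javascript
// Hypothetical helper: open a new tab and set its user agent
// before any navigation happens on it.
async function newPageWithUserAgent(browser, userAgent) {
  const page = await browser.newPage();
  await page.setUserAgent(userAgent);
  return page;
}

// Usage inside a Puppeteer script (assumes `browser` comes from puppeteer.launch()):
//   const page = await newPageWithUserAgent(browser, "<your_user_agent>");
//   await page.goto("https://httpbin.io/user-agent");
```

Opening every tab through the helper keeps the custom user agent consistent across pages without repeating the setUserAgent() call by hand.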

For a complete example, take a look at the following snippet:

import puppeteer from "puppeteer";

(async () => {
  // launch the browser and open a new page
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  // set a custom user agent
  await page.setUserAgent("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/127.0.0.0 Safari/537.36");

  // connect to the target page
  await page.goto("https://httpbin.io/user-agent");

  // extract the body text with the API response
  // and print it
  const bodyText = await page.evaluate(() => {
    return document.body.innerText;
  });
  console.log(bodyText);

  // close the browser and release its resources
  await browser.close();
})();

Execute the above script, and this time the result will be:

{
  "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/127.0.0.0 Safari/537.36"
}

Wonderful! The user agent string extracted from the page matches the one configured in the code. You now know how to change the user agent in Puppeteer.
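A hard-coded string can drift out of sync with the Chrome version that Puppeteer actually controls. One possible alternative, sketched below under that assumption, is to read the browser's own user agent with browser.userAgent() and replace only the HeadlessChrome token. Both helper names are hypothetical and introduced here for illustration.

```javascript
// Hypothetical helper: turn a headless Chrome user agent into the
// equivalent regular Chrome one by swapping the product token.
function toRegularChromeUserAgent(userAgent) {
  return userAgent.replace("HeadlessChrome/", "Chrome/");
}

// Hypothetical helper: read the browser's default user agent,
// patch it, and apply it to the given page.
async function applyRegularUserAgent(browser, page) {
  const defaultUserAgent = await browser.userAgent();
  const patchedUserAgent = toRegularChromeUserAgent(defaultUserAgent);
  await page.setUserAgent(patchedUserAgent);
  return patchedUserAgent;
}

// Usage inside a Puppeteer script:
//   const page = await browser.newPage();
//   await applyRegularUserAgent(browser, page);
```

This keeps the version number and platform details aligned with the real browser, so the header does not contradict what the page can observe elsewhere.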

Implement User Agent Rotation in Puppeteer

Replacing the User-Agent header of a headless browser with one from a non-headless browser may not be enough to elude anti-bot systems. The problem is that your browser automation script will exhibit patterns that indicate non-human behavior. That is particularly true if you send a large number of requests with the same headers from the same IP address.

To minimize the risk of bot detection in Puppeteer, you need to make your requests as varied as possible. An effective method is to set a different user agent for each request. This strategy is called user agent rotation and helps you reduce the likelihood of getting flagged as a bot.

In the following tutorial section, you will learn how to implement user agent rotation in Puppeteer!

Step #1: Generate a Random User Agent

There are two main approaches to getting a random user agent. The first is to get a list of valid user agents and randomly select one from the list, as shown in our Node.js user agent guide.
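For reference, the first approach can be sketched in a few lines of plain JavaScript. The list below is a small illustrative sample chosen for this example, and pickRandomUserAgent() is a hypothetical helper name; in practice you would maintain a larger, regularly refreshed list.

```javascript
// Illustrative sample list; a real project would keep a larger,
// up-to-date set of user agents.
const USER_AGENTS = [
  "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/127.0.0.0 Safari/537.36",
  "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/127.0.0.0 Safari/537.36",
  "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:128.0) Gecko/20100101 Firefox/128.0",
];

// Hypothetical helper: pick one entry uniformly at random
function pickRandomUserAgent(userAgents = USER_AGENTS) {
  const index = Math.floor(Math.random() * userAgents.length);
  return userAgents[index];
}

console.log(pickRandomUserAgent());
```

The main drawback of this approach is maintenance: the list goes stale as browsers release new versions, which is what the library-based approach described next avoids.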

The second approach is to use a third-party user agent generator library. The most popular in JavaScript is user-agents, a package that can generate valid user agents for you. In this guide, we will follow this method!

Run this npm command to add the user-agents library to your project’s dependencies:

npm install user-agents

Note that user-agents is updated daily, ensuring it always includes the latest user agents.

After installing the library, import the UserAgent object into your Puppeteer script:

import UserAgent from "user-agents";

You can now generate a random user agent string with the following code:

const userAgent = new UserAgent().random().toString();

Check out the official documentation for more advanced usage, such as getting user agents for specific devices, operating systems, and more.

Step #2: Set the Random User Agent

Pass the random user agent string to Puppeteer by calling the setUserAgent() method:

const userAgent = new UserAgent().random().toString();
await page.setUserAgent(userAgent);

All requests performed via the page object will now have a custom user agent, generated randomly.

Step #3: Visit the Target Page

Call the goto() method on page to visit your target webpage in the controlled headless browser:

await page.goto("https://httpbin.io/user-agent");

Step #4: Put It All Together

Below is your final Puppeteer user agent rotation script:

import puppeteer from "puppeteer";
import UserAgent from "user-agents";

(async () => {
  // launch the browser and open a new page
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  // generate a random user agent
  const userAgent = new UserAgent().random().toString();

  // set a random user agent
  await page.setUserAgent(userAgent);

  // connect to the target page
  await page.goto("https://httpbin.io/user-agent");

  // extract the body text with the API response
  // and print it
  const bodyText = await page.evaluate(() => {
    return document.body.innerText;
  });
  console.log(bodyText);

  // close the browser and release its resources
  await browser.close();
})();

Run the script a few times, and you should see different user agents.

Et voilà! The user agent rotation logic in Puppeteer works like a charm.
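The script above picks one user agent per run. If a single run visits several pages, a possible extension is to set a fresh user agent before each navigation. The following is a minimal sketch under that assumption: visitWithRotation() is a hypothetical helper, and the nextUserAgent callback is whatever generator you choose, for example () => new UserAgent().random().toString().

```javascript
// Hypothetical helper: visit each URL with a freshly generated user agent.
// `page` is a Puppeteer page; `nextUserAgent` is any function returning a string.
async function visitWithRotation(page, urls, nextUserAgent) {
  const visits = [];
  for (const url of urls) {
    // set a new user agent before navigating
    const userAgent = nextUserAgent();
    await page.setUserAgent(userAgent);
    await page.goto(url);
    visits.push({ url, userAgent });
  }
  return visits;
}

// Usage inside a Puppeteer script (assumes the user-agents import shown earlier):
//   const visits = await visitWithRotation(
//     page,
//     ["https://httpbin.io/user-agent", "https://httpbin.io/user-agent"],
//     () => new UserAgent().random().toString()
//   );
//   console.log(visits);
```

Returning the list of visits makes it easy to log which user agent was used for which URL when debugging blocks.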

Set a Custom User Agent in Puppeteer With puppeteer-extra-plugin-anonymize-ua

The above approaches to setting the Puppeteer user agent are effective, but they have a significant drawback. The setUserAgent() method only changes the user agent for a specific page object, not across all browser tabs.

To ensure Puppeteer never uses the default user agent, you can use the puppeteer-extra-plugin-anonymize-ua plugin from Puppeteer Extra. If you are not familiar with this project, Puppeteer Extra extends Puppeteer by adding support for community-defined plugins. You can discover more in our guide on the Puppeteer Extra Stealth plugin.

The puppeteer-extra-plugin-anonymize-ua plugin can anonymize the user agent set by Puppeteer. To install puppeteer-extra and the necessary plugin, run the following command:

npm install puppeteer-extra puppeteer-extra-plugin-anonymize-ua

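Next, import puppeteer from puppeteer-extra and AnonymizeUAPlugin from puppeteer-extra-plugin-anonymize-ua. Also import UserAgent from user-agents, since the plugin configuration relies on it:

import puppeteer from "puppeteer-extra";
import AnonymizeUAPlugin from "puppeteer-extra-plugin-anonymize-ua";
import UserAgent from "user-agents";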

Configure puppeteer-extra-plugin-anonymize-ua to generate a random user agent and register it as a plugin with the use() method from Puppeteer Extra:

puppeteer.use(
  AnonymizeUAPlugin({
    customFn: () => new UserAgent().random().toString(),
  })
);

Now, try visiting pages in two different tabs using two separate page objects:

import puppeteer from "puppeteer-extra";
import AnonymizeUAPlugin from "puppeteer-extra-plugin-anonymize-ua";
import UserAgent from "user-agents";

(async () => {
  // configure and register the
  // puppeteer-extra-plugin-anonymize-ua plugin
  puppeteer.use(
    AnonymizeUAPlugin({
      customFn: () => new UserAgent().random().toString(),
    })
  );

  // launch the browser
  const browser = await puppeteer.launch();

  // open a new page
  const page1 = await browser.newPage();

  // connect to the target page
  await page1.goto("https://httpbin.io/user-agent");

  // extract the body text with the API response
  // and print it
  const bodyText1 = await page1.evaluate(() => {
    return document.body.innerText;
  });
  console.log(bodyText1);

  // open a second page
  const page2 = await browser.newPage();

  // connect to the target page
  await page2.goto("https://httpbin.io/user-agent");

  // extract the body text with the API response
  // and print it
  const bodyText2 = await page2.evaluate(() => {
    return document.body.innerText;
  });
  console.log(bodyText2);

  // close the browser and release its resources
  await browser.close();
})();

The result will be two different user agents, as shown below:

{
  "user-agent": "Mozilla/5.0 (iPhone; CPU iPhone OS 17_5_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.5 Mobile/15E148 Safari/604.1"
}

{
  "user-agent": "Mozilla/5.0 (Linux; Android 10; K) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/127.0.0.0 Mobile Safari/537.36"
}

The two user agents are different, and neither is the default Puppeteer user agent. That occurs because puppeteer-extra-plugin-anonymize-ua customizes each page object with a random user agent as specified in the customFn function. This way, Puppeteer will never expose its default user agent!

Conclusion

In this article, you explored the importance of setting the User-Agent header and saw what the default Puppeteer user agent looks like. You learned how to override that value and implement user agent rotation to elude basic anti-scraping systems. However, more sophisticated systems can still detect and block your automated requests. To prevent IP bans, you can configure a proxy in Puppeteer, but even that may not always be enough!

For a more effective solution, try Scraping Browser—a next-generation browser that integrates with Puppeteer and any other browser automation tool. Scraping Browser can effortlessly bypass anti-bot technologies for you while avoiding browser fingerprinting. Under the hood, it relies on features like user agent rotation, IP rotation, and CAPTCHA solving. Browser automation has never been easier!

Sign up now and start your free trial.
