How to Bypass CAPTCHAs With Puppeteer

Learn to bypass CAPTCHAs with Puppeteer using stealth plugins and advanced techniques for seamless automation.

In this guide, you will learn:

  • What CAPTCHAs are and whether you can bypass them
  • How to use Puppeteer to bypass CAPTCHAs via a step-by-step tutorial
  • What to do if the process with Puppeteer does not work

Let’s dive in!

What Are CAPTCHAs? And Can You Bypass Them?

CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) is a challenge-response test that distinguishes humans from automated bots. To achieve this, CAPTCHAs are designed to be easy for humans to solve but difficult for software.

Popular CAPTCHA providers include Google reCAPTCHA, hCaptcha, and BotDetect. Common CAPTCHA types are:

  • Text-based: In these challenges, users have to recognize letters and numbers and type them.
  • Image-based: These tests require users to identify specific objects in a grid of images by selecting the right images.
  • Audio-based: In this type, users have to write the letters they hear.
  • Puzzle challenges: This type of challenge requires users to solve a simple puzzle by sliding a piece into the dedicated place.

CAPTCHAs are designed to be hard for automated software and bots to bypass. One option is to integrate your software with CAPTCHA-solving libraries, or with services that rely on human operators, to solve these challenges for you.

However, hard-coded CAPTCHAs are not common because they have a negative impact on the overall user experience of the website. For this reason, CAPTCHAs are more commonly used as part of broader anti-bot solutions, such as WAFs (Web Application Firewalls):

A CAPTCHA on G2 generated by Cloudflare

In these cases, the system dynamically displays a CAPTCHA when it suspects that a bot is interacting with the website. To bypass these CAPTCHAs, you have to develop a bot that mimics human behavior. While this can be done, it requires a lot of effort, particularly because you need to update your scripts frequently to stay ahead of new bot detection techniques.

The good news is that there is a more effective solution to bypass CAPTCHAs: Bright Data’s CAPTCHA Solver! This always up-to-date tool solves all your problems related to bypassing CAPTCHAs without any headaches.

How to Bypass CAPTCHAs With Puppeteer: Step-By-Step Tutorial

Now it is time to create an automated script that mimics human behavior to bypass CAPTCHAs.

To do so, you can use Puppeteer: a JavaScript library that provides a high-level API for controlling web browsers and, thus, can be used to mimic human behavior.
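
To give an idea of what "mimicking human behavior" can mean in practice, here is a minimal sketch of typing with irregular timing. The helper names (randomBetween, humanPause, humanType) and the delay ranges are assumptions made for illustration, not part of Puppeteer. The only Puppeteer call used is page.type() with its delay option.

```javascript
// Hypothetical helpers: the names and delay ranges are illustrative assumptions.

// Return a random integer in [min, max], used to vary timing between actions.
const randomBetween = (min, max) =>
  Math.floor(Math.random() * (max - min + 1)) + min;

// Pause for a random duration so that actions are not evenly spaced.
const humanPause = (minMs = 200, maxMs = 800) =>
  new Promise((resolve) => setTimeout(resolve, randomBetween(minMs, maxMs)));

// Type text one character at a time with a varying per-key delay.
// `page` is a Puppeteer Page; page.type() accepts a { delay } option in milliseconds.
const humanType = async (page, selector, text) => {
  for (const char of text) {
    await page.type(selector, char, { delay: randomBetween(50, 150) });
  }
};
```

Inside a script, you could call await humanType(page, '#search', 'puppeteer') followed by await humanPause(), where '#search' is a placeholder selector. Randomized timing alone is unlikely to defeat a serious anti-bot system, but it removes one obvious regularity.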

Let’s get started!

Step #1: Project Setup

Suppose you call the main folder of your project bypass_captcha_puppeteer. Here is the structure the repository should have:

bypass_captcha_puppeteer/
├── index.js
└── package.json

You can create the project folder with:

mkdir bypass_captcha_puppeteer

Then, enter the project folder and launch npm init to initialize a Node.js application:

cd bypass_captcha_puppeteer
npm init -y

Next, create an index.js file inside it.

Install Puppeteer as below:

npm install puppeteer

Step #2: Use ESM JavaScript Notation

To use ECMAScript Modules (ESM) notation in JavaScript, the package.json file must have the "type": "module" option.

Here is what the package.json file should look like:

{
  "name": "bypass_captcha_puppeteer",
  "version": "1.0.0",
  "description": "",
  "main": "index.js",
  "type": "module",
  "scripts": {
    "start": "node index.js"
  },
  "dependencies": {
    "puppeteer": "^23.10.4"
  }
}
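
If you want to double-check this setting programmatically, the following sketch parses the content of a package.json file and reports whether it opts into ES modules. The usesEsm helper is a hypothetical name introduced here for illustration. Depending on the Node.js version, a missing "type": "module" can make the import statements in the next steps fail.

```javascript
// Hypothetical helper: checks whether package.json content opts into ES modules.
const usesEsm = (packageJsonText) => {
  const pkg = JSON.parse(packageJsonText);
  // The "type" field must be exactly "module" for .js files to be treated as ESM.
  return pkg.type === 'module';
};

console.log(usesEsm('{"name": "bypass_captcha_puppeteer", "type": "module"}')); // true
console.log(usesEsm('{"name": "bypass_captcha_puppeteer"}')); // false
```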

Step #3: Try To Bypass CAPTCHA With Puppeteer

Write the following code into the index.js file to see whether Puppeteer appears as a bot or not:

import puppeteer from 'puppeteer';

const visitBotAnalyzerPage = async () => {
  try {
    // initialize the browser
    const browser = await puppeteer.launch();

    // open a new browser page
    const page = await browser.newPage();

    // navigate to the target URL
    const url = 'https://bot.sannysoft.com/';
    console.log(`Navigating to ${url}...`);
    await page.goto(url, { waitUntil: 'networkidle2' });
    
    // save a full-page screenshot
    console.log('Taking full-page screenshot...');
    await page.screenshot({ path: 'anti-bot-analysis.png', fullPage: true });
    console.log('Screenshot taken');
    
    // close the browser
    await browser.close();
    console.log('Browser closed');
  } catch (error) {
    console.error('An error occurred:', error);
  }
};

// run the script
visitBotAnalyzerPage();

Here is what this code does:

  1. Launches the browser: The puppeteer.launch() method starts a new browser instance. Since no options are passed, it runs in headless mode, which is the default. Pass { headless: false } if you want to see the browser UI.
  2. Opens a new browser page: browser.newPage() creates a new blank browser page where further actions can be performed.
  3. Goes to the target page: The page.goto() method navigates to bot.sannysoft.com, a page that runs the Intoli.com tests to determine whether a request comes from a bot.
  4. Saves a screenshot of the results: The page.screenshot() method takes a full-page screenshot of the results and saves it to a file.
  5. Closes the browser and handles errors: browser.close() closes the browser, while the try...catch block intercepts any errors.
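
One detail worth noting in the script above: if page.goto() or page.screenshot() throws, execution jumps to the catch block and browser.close() is never reached, so the browser process may be left running. A possible way to guard against that is a small wrapper with a finally block. The withBrowser name and its signature are assumptions for illustration, not a Puppeteer API.

```javascript
// Hypothetical helper: runs a task with a browser and always closes it afterwards.
// `launch` is any function that returns a browser, e.g. () => puppeteer.launch().
const withBrowser = async (launch, task) => {
  const browser = await launch();
  try {
    return await task(browser);
  } finally {
    // Runs on both success and failure, so the browser process is not leaked.
    await browser.close();
  }
};

// Possible usage inside index.js (assumes puppeteer is already imported):
// await withBrowser(() => puppeteer.launch(), async (browser) => {
//   const page = await browser.newPage();
//   await page.goto('https://bot.sannysoft.com/', { waitUntil: 'networkidle2' });
// });
```

Passing the launcher as a function keeps the helper independent of whether you import puppeteer or puppeteer-extra.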

To run the code, type:

node index.js

You can now open the saved image. This is the expected result:

The expected result

As the image shows, Puppeteer fails a few of the tests. Consequently, WAFs are likely to show CAPTCHAs when you interact with pages through Puppeteer.
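
Besides looking at the screenshot, you can inspect some of the underlying signals directly. The sketch below reads a few browser properties with page.evaluate() and turns them into warnings. The helper names and the specific checks are assumptions chosen for illustration. They resemble common detection heuristics but are not the exact tests the page runs.

```javascript
// Hypothetical helpers: the signal names and checks are illustrative assumptions.

// Read a few browser properties inside the page context.
// `page` is a Puppeteer Page; page.evaluate() runs the callback in the browser.
const collectSignals = (page) =>
  page.evaluate(() => ({
    webdriver: navigator.webdriver,
    userAgent: navigator.userAgent,
    pluginCount: navigator.plugins.length,
    languageCount: navigator.languages.length,
  }));

// Turn the raw signals into a list of human-readable warnings.
const findBotHints = (signals) => {
  const hints = [];
  if (signals.webdriver === true) hints.push('navigator.webdriver is true');
  if (/HeadlessChrome/.test(signals.userAgent)) hints.push('user agent contains HeadlessChrome');
  if (signals.pluginCount === 0) hints.push('no browser plugins reported');
  if (signals.languageCount === 0) hints.push('no languages reported');
  return hints;
};
```

After page.goto(), something like console.log(findBotHints(await collectSignals(page))) would print whichever hints apply to your browser instance.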

To solve these issues, let’s use the Puppeteer Stealth plugin!

Step #4: Install The Stealth Plugin

Puppeteer Extra is a lightweight wrapper around Puppeteer that, among other things, lets you install plugins. One of them is the Stealth plugin, which helps prevent bot detection by overriding several browser configurations so that the browser instance appears natural and “human-like.”

Install these libraries like so:

npm install puppeteer-extra puppeteer-extra-plugin-stealth

Import Puppeteer from puppeteer-extra instead of puppeteer:

import puppeteer from 'puppeteer-extra';

Fantastic! You are ready to use the Stealth plugin to try to avoid CAPTCHAs with Puppeteer.
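
To get a feel for what "overriding configurations" means, here is a hand-rolled sketch of a single override. It is an illustration only and not the Stealth plugin's actual implementation, which applies many more evasions. The hideWebdriverFlag name is an assumption. The Puppeteer call used is page.evaluateOnNewDocument().

```javascript
// Hypothetical helper: a single hand-rolled evasion, for illustration only.
// `page` is a Puppeteer Page; evaluateOnNewDocument() registers a script that
// runs in every new document before the page's own scripts execute.
const hideWebdriverFlag = (page) =>
  page.evaluateOnNewDocument(() => {
    // Redefine navigator.webdriver on the prototype so that it reads as false,
    // which is what a regular, non-automated Chrome reports.
    Object.defineProperty(Navigator.prototype, 'webdriver', {
      get: () => false,
      configurable: true,
    });
  });
```

You would call await hideWebdriverFlag(page) before page.goto(). A determined detector may still notice the redefined property, which is one reason a maintained plugin is usually preferable to ad hoc patches.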

Step #5: Repeat the Test With the Stealth Plugin

Now you have to register the Stealth plugin with this line of code:

puppeteer.use(StealthPlugin());

So, the code becomes:

import puppeteer from 'puppeteer-extra';
import StealthPlugin from 'puppeteer-extra-plugin-stealth';

// Add the stealth plugin to Puppeteer
puppeteer.use(StealthPlugin());

const visitBotAnalyzerPage = async () => {
  try {
    // launch the browser with stealth settings
    console.log('Launching browser in stealth mode...');
    const browser = await puppeteer.launch();
    
    // open a new page
    const page = await browser.newPage();

    // navigate to the target page
    const url = 'https://bot.sannysoft.com/';
    console.log(`Navigating to ${url}...`);
    await page.goto(url, { waitUntil: 'networkidle2' });

    // save the screenshot of the entire page
    console.log('Taking full-page screenshot...');
    await page.screenshot({ path: 'anti-bot-analysis.png', fullPage: true });
    console.log(`Screenshot taken`);

    // close the browser
    await browser.close();
    console.log('Browser closed. Script completed successfully');
  } catch (error) {
    console.error('Error occurred:', error);
  }
};

// run the script
visitBotAnalyzerPage();

Now, when you run the code again with:

node index.js

The expected result is:

The final expected result after running the code

Hooray! The script now passes the bot detection tests, which means you are less likely to receive CAPTCHAs with Puppeteer!

What To Do If the Above Procedure to Bypass CAPTCHAs With Puppeteer Does Not Work

Unfortunately, Puppeteer Extra is not a silver bullet. The reason is that browser settings are not the only thing anti-bot systems look at when blocking automated software.

For example, the user agent is another factor used by anti-bot systems to block automated software. To address this, you can use the puppeteer-extra-plugin-anonymize-ua library, which anonymizes the user agent.
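
As a rough illustration of the idea, the sketch below removes the "Headless" marker that headless Chrome can put in its user agent and applies the cleaned string to a page. This is a hand-rolled approximation and not the plugin's code. The helper names are assumptions, while browser.userAgent() and page.setUserAgent() are Puppeteer methods.

```javascript
// Hypothetical helpers: a hand-rolled approximation, not the plugin's code.

// Remove the "Headless" marker that headless Chrome can put in its user agent.
const stripHeadless = (userAgent) =>
  userAgent.replace('HeadlessChrome/', 'Chrome/');

// Read the browser's default user agent and apply the cleaned version to a page.
const applyCleanUserAgent = async (browser, page) => {
  const original = await browser.userAgent();
  const cleaned = stripHeadless(original);
  await page.setUserAgent(cleaned);
  return cleaned;
};
```

Call await applyCleanUserAgent(browser, page) right after browser.newPage() and before navigating, so that the first request already carries the adjusted user agent.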

However, the approach based on plugins described before works only against basic anti-bot measures: when dealing with more complex tools like Cloudflare, you need something more powerful.

So… are you looking for a real Puppeteer CAPTCHA solver? Try Bright Data web scraping solutions!

These provide superior unlocking capabilities with a dedicated CAPTCHA-solving feature to automatically handle reCAPTCHA, hCaptcha, px_captcha, SimpleCaptcha, GeeTest CAPTCHA, FunCaptcha, Cloudflare Turnstile, AWS WAF Captcha, KeyCAPTCHA, and many others.

Integrating Bright Data’s CAPTCHA Solver into your scripts is easy, as it works with any HTTP client or browser automation tool.

Find out more about how to use Bright Data’s CAPTCHA Solver and check out the documentation for all integration and configuration details.

Conclusion

In this article, you learned why bypassing CAPTCHAs with Puppeteer can be challenging, and how to use the Stealth plugin to override the default browser configuration to circumvent bot detection.

The problem with that approach is that it works only in simple scenarios. Advanced bot detection systems can still identify you as a bot and block you.

So, when bypassing CAPTCHAs, the actual solution is to connect to your target page through an unlocking API that can seamlessly return the CAPTCHA-free HTML of any web page. This solution exists and is called Web Unlocker. It automatically rotates the exit IP with each request via proxy integration and handles browser fingerprinting, automatic retries, and CAPTCHA resolution for you.
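
For context, this is roughly how a generic proxy is wired into Puppeteer. The sketch below is not Bright Data's integration code: the buildProxyConfig helper is a hypothetical name and the proxy URL is a placeholder, so refer to the provider's documentation for the real endpoint and credentials. It relies on Chromium's --proxy-server flag and on Puppeteer's page.authenticate() method.

```javascript
// Hypothetical helper; the proxy URL used below is a placeholder, not a real endpoint.
const buildProxyConfig = (proxyUrl) => {
  const { protocol, host, username, password } = new URL(proxyUrl);
  return {
    // The --proxy-server flag takes only the scheme, host, and port.
    launchOptions: { args: [`--proxy-server=${protocol}//${host}`] },
    // Credentials are supplied separately through page.authenticate().
    credentials: username
      ? { username: decodeURIComponent(username), password: decodeURIComponent(password) }
      : null,
  };
};

const config = buildProxyConfig('http://user:secret@proxy.example.com:8080');
console.log(config.launchOptions.args[0]); // --proxy-server=http://proxy.example.com:8080
```

In a script, you would pass config.launchOptions to puppeteer.launch() and, when credentials are present, call await page.authenticate(config.credentials) before navigating.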

Sign up now to discover which of Bright Data’s scraping products best suits your needs.
