In this guide, you will learn:
- What a CAPTCHA is and whether it can be avoided
- How to implement Playwright CAPTCHA bypass logic
- What to do if the CAPTCHA still shows up
Time to dive in!
What Are CAPTCHAs and Can You Bypass Them?
A CAPTCHA, short for “Completely Automated Public Turing test to tell Computers and Humans Apart,” is a test used to distinguish between human users and automated bots. It is a challenge specifically designed to be easy for humans but difficult for machines to solve.
Popular CAPTCHA providers include Google reCAPTCHA, hCaptcha, and BotDetect. These usually support one or more of the CAPTCHA types below:
- Image-based challenges: Users must identify specific objects in a grid of images, or select the images that match a given description.
- Text-based challenges: Users are required to type a sequence of distorted letters and numbers.
- Audio-based challenges: Users are asked to type the words they hear.
- Puzzle challenges: Users must solve a simple puzzle, such as sliding a piece into place.
CAPTCHAs can be part of a particular user flow, such as the final step of submitting a form.
In these cases, the CAPTCHA is always displayed and bots cannot really avoid it. What you can do is integrate your software with CAPTCHA-solving libraries that automate the challenge, or with services that rely on human operators to solve it in real time. However, hard-coded CAPTCHAs are not common because they are annoying and hurt the user experience.
More commonly, CAPTCHAs are used as part of broader anti-bot solutions, such as WAFs (Web Application Firewalls).
These systems dynamically display a CAPTCHA when they suspect the user may be a bot. In these instances, CAPTCHAs can be bypassed by making your bot mimic human behavior. Still, this is a cat-and-mouse game that requires continually updating your automated script to avoid new bot detection measures.
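To make "mimic human behavior" more concrete, here is a minimal sketch of human-like mouse and keyboard input. The helper names (randomBetween, humanClick, humanType) and the delay ranges are illustrative assumptions, not part of Playwright; only page.mouse, page.keyboard, and page.waitForTimeout() come from its API. Randomized timing alone is unlikely to defeat a serious anti-bot system.

```javascript
// Hypothetical helpers: names and delay ranges are illustrative assumptions.

// Return a random integer between min and max (both inclusive).
function randomBetween(min, max) {
  return Math.floor(Math.random() * (max - min + 1)) + min
}

// Move the mouse to a point in several small steps, pause, then click.
async function humanClick(page, x, y) {
  // "steps" makes Playwright emit intermediate mousemove events
  await page.mouse.move(x, y, { steps: randomBetween(10, 25) })
  // short pause before clicking, as a person would take
  await page.waitForTimeout(randomBetween(150, 600))
  await page.mouse.click(x, y)
}

// Type text one character at a time with an irregular pause after each key.
async function humanType(page, text) {
  for (const char of text) {
    await page.keyboard.type(char)
    await page.waitForTimeout(randomBetween(60, 180))
  }
}
```

Both functions take an existing Playwright page, so they can be called from any script, for example humanClick(page, 320, 240) followed by humanType(page, "hello").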
A more effective solution to CAPTCHA bypass is to use a user-emulation-based and always up-to-date tool like Bright Data’s CAPTCHA Solver.
Playwright Bypass CAPTCHA: Step-By-Step Tutorial
As you just learned, an effective approach to avoid CAPTCHAs is to make your automated script simulate human behaviors while using a human-like fingerprint. One of the best tools for this purpose is Playwright, a leading browser automation library that appears in the list of the top web scraping tools of the year.
In this tutorial section, you will see how to implement Playwright CAPTCHA bypassing logic. You will learn how to achieve this goal using a Node.js script in JavaScript. If you are a Python developer, you can take a look at our equivalent guide on Playwright Stealth.
Let’s get started!
Step #1: Initialize Your Node.js Project
If you already have a Playwright web scraping or testing script, you can skip this step. Otherwise, create a folder for your Playwright CAPTCHA solver project and enter it in the terminal:
mkdir playwright_demo
cd playwright_demo
Initialize a new Node.js project inside it with the npm init command below:
npm init -y
Open the project’s folder in your favorite JavaScript IDE and add a new script.js file.
Then, do not forget to open package.json and mark your project as a module by adding:
"type": "module"
Wonderful, your project’s folder now contains a Node.js application.
Step #2: Install Playwright Extra
One known limitation of Playwright is that it does not natively support plugins. The community has made up for this shortcoming with Playwright Extra, a library that extends playwright with plugin support.
Add playwright and playwright-extra to your project’s dependencies with this command:
npm i playwright playwright-extra
This may take a while, so be patient.
Step #3: Set Up Your Playwright Script
It is time to initialize your script to let Playwright solve CAPTCHA challenges. Import the browser you want to control from playwright-extra by adding this line to script.js:
import { chromium } from "playwright-extra"
In this case, we are going to automate human behavior in Chromium.
Then, define an async function in which to perform the human-like interaction using the Playwright API:
(async () => {
  // set up the browser and launch it
  const browser = await chromium.launch()
  // open a new blank page
  const page = await browser.newPage()
  // browser automation logic...
  // close the browser and release its resources
  await browser.close()
})()
This launches a new Chromium instance and opens a new page before closing the browser. Great, you are ready to add the browser automation logic!
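In the snippet above, browser.close() is skipped if the automation logic throws, which can leave a Chromium process running. One possible way to guard against that is a try/finally wrapper. The withBrowser name is a hypothetical helper introduced here for illustration; it only assumes that the object passed in exposes a launch() method, as the chromium export does.

```javascript
// Hypothetical helper: run a task with a browser and always close it.
// "browserType" is any object with a launch() method, such as the
// chromium export used in this tutorial.
async function withBrowser(browserType, task) {
  const browser = await browserType.launch()
  try {
    // hand the browser over to the caller's automation logic
    return await task(browser)
  } finally {
    // runs on success and on error, so the browser is always closed
    await browser.close()
  }
}
```

With the tutorial's import in place, the automation logic would go inside the callback, for example withBrowser(chromium, async (browser) => { const page = await browser.newPage() }).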
Step #4: Implement the Browser Automation Logic
The target site will be bot.sannysoft.com, a special web page that runs some tests in the browser to figure out whether the user is a human or a bot. If you try to visit this page on your local browser, you should see that all the tests are passed.
Connect to the target page using the goto() method:
await page.goto("https://bot.sannysoft.com/")
Then, take a screenshot of the entire page to see the results of the anti-bot tests:
await page.screenshot({ path: "results.png", fullPage: true })
Put it all together, and you will get the following script.js file:
import { chromium } from "playwright-extra"
(async () => {
  // set up the browser and launch it
  const browser = await chromium.launch()
  // open a new blank page
  const page = await browser.newPage()
  // navigate to the target page
  await page.goto("https://bot.sannysoft.com/")
  // take a screenshot of the entire page
  await page.screenshot({
    path: "results.png",
    fullPage: true
  })
  // close the browser and release its resources
  await browser.close()
})()
Execute the above code with the command below:
node script.js
The script will open a Chromium instance in headless mode, visit the desired page, take a screenshot, and then close the browser. If you open the results.png file that appears in the project root folder once the script finishes, you will see the outcome of each test.
As you can tell, vanilla Playwright in headless mode does not pass several tests. This is why WAFs show CAPTCHAs when interacting with pages in Playwright. The solution? The Stealth plugin!
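To get a feel for what such tests look at, the sketch below reads a few browser properties and flags the ones that commonly give automation away. The selection of properties and the findLeaks rules are illustrative assumptions, not the actual checks run by the test page. What is certain is that navigator.webdriver is defined by the WebDriver specification and is true in automated browsers; the other rules are heuristics.

```javascript
// Read a few properties that bot-detection pages commonly inspect.
// The selection of properties is an illustrative assumption.
async function collectFingerprint(page) {
  // the callback runs inside the browser page, not in Node.js
  return page.evaluate(() => ({
    webdriver: navigator.webdriver,
    userAgent: navigator.userAgent,
    pluginCount: navigator.plugins.length,
    languages: Array.from(navigator.languages),
  }))
}

// Hypothetical helper: list the signals that look automated.
function findLeaks(fingerprint) {
  const leaks = []
  if (fingerprint.webdriver === true) {
    leaks.push("navigator.webdriver is true")
  }
  if (/HeadlessChrome/.test(fingerprint.userAgent)) {
    leaks.push("user agent contains HeadlessChrome")
  }
  if (fingerprint.pluginCount === 0) {
    leaks.push("no browser plugins reported")
  }
  if (fingerprint.languages.length === 0) {
    leaks.push("no languages reported")
  }
  return leaks
}
```

Calling findLeaks(await collectFingerprint(page)) before and after applying a fix gives a rough, text-based complement to the screenshot.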
Step #5: Install the Playwright Stealth Plugin
Playwright Stealth is a plugin for playwright-extra that helps prevent bot detection. It overrides several configurations to make the browser instance appear natural, as if it were not being controlled by Playwright. Specifically, the module modifies browser properties to close the leaks that commonly expose the browser as automated.
The Stealth plugin was originally developed for Puppeteer Extra, but it also works for Playwright Extra. Install it via the puppeteer-extra-plugin-stealth npm package with this command:
npm i puppeteer-extra-plugin-stealth
Next, import the Stealth plugin in your script.js file with this line:
import StealthPlugin from "puppeteer-extra-plugin-stealth"
Step #6: Register the Stealth Settings
To implement Playwright CAPTCHA bypass logic, simply register the Stealth plugin in playwright-extra through the use() method:
chromium.use(StealthPlugin())
The browser controlled by Playwright will now appear as a real-world browser in use by a human user.
Step #7: Repeat the Bot Detection Test
Here is what your script.js file should currently look like:
import { chromium } from "playwright-extra"
import StealthPlugin from "puppeteer-extra-plugin-stealth"
(async () => {
  // register the Stealth plugin
  chromium.use(StealthPlugin())
  // set up the browser and launch it
  const browser = await chromium.launch()
  // open a new blank page
  const page = await browser.newPage()
  // navigate to the target page
  await page.goto("https://bot.sannysoft.com/")
  // take a screenshot of the entire page
  await page.screenshot({
    path: "results.png",
    fullPage: true
  })
  // close the browser and release its resources
  await browser.close()
})()
Launch the script again:
node script.js
Open results.png again, and you will now see that all bot-detection tests have passed.
Et voilà! The Playwright CAPTCHA bypass trick is complete!
What If the Above Playwright CAPTCHA Solver Solution Does Not Work?
Unfortunately, browser settings are not the only aspect that anti-bot tools focus their attention on. IP reputation is another key factor, and you cannot just change your exit IP with a free library. You need Playwright proxy integration for that!
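As a rough sketch of what proxy integration involves, Playwright's launch() accepts a proxy option with server, username, and password fields. The toProxyOption helper below is a hypothetical convenience for building that object from a single proxy URL, and proxy.example.com is a placeholder, not a real endpoint.

```javascript
// Hypothetical helper: turn a proxy URL into Playwright's "proxy" launch option.
function toProxyOption(proxyUrl) {
  const url = new URL(proxyUrl)
  // "server" holds the scheme, host, and port without credentials
  const proxy = { server: `${url.protocol}//${url.host}` }
  // credentials are optional, so set them only when the URL carries them
  if (url.username) {
    proxy.username = decodeURIComponent(url.username)
    proxy.password = decodeURIComponent(url.password)
  }
  return proxy
}

// Placeholder URL for illustration only
const proxyOption = toProxyOption("http://user:secret@proxy.example.com:8080")
```

The resulting object can then be passed to the launcher, as in chromium.launch({ proxy: proxyOption }).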
Thus, CAPTCHAs may still appear even if you configure your browser optimally. For simple CAPTCHAs that require only a single click, you can use the puppeteer-extra-plugin-recaptcha plugin. However, the plugin-based approach from the previous section works only against basic anti-bot measures. When dealing with more complex tools like Cloudflare, you need something more powerful.
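Before deciding how to react, a script first needs to notice that a CAPTCHA has appeared. One possible heuristic, sketched below, is to look at the URLs of the frames attached to the page, since CAPTCHA widgets are typically rendered inside iframes. The list of URL hints and both function names are illustrative assumptions and far from exhaustive.

```javascript
// URL fragments that commonly appear in CAPTCHA widget iframes.
// This list is an illustrative assumption, not an exhaustive catalog.
const CAPTCHA_FRAME_HINTS = [
  "/recaptcha/",
  "hcaptcha.com",
  "challenges.cloudflare.com",
]

// Pure check on a list of frame URLs.
function hasCaptchaFrame(frameUrls) {
  return frameUrls.some((url) =>
    CAPTCHA_FRAME_HINTS.some((hint) => url.includes(hint))
  )
}

// Apply the check to every frame currently attached to a Playwright page.
function pageHasCaptchaFrame(page) {
  return hasCaptchaFrame(page.frames().map((frame) => frame.url()))
}
```

A script could call pageHasCaptchaFrame(page) after navigation and, when it returns true, retry through a different IP or hand the page over to a solving service.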
Looking for a real Playwright CAPTCHA solver? Try Bright Data web scraping solutions!
These provide superior unlocking capabilities with a dedicated CAPTCHA-solving feature to automatically handle reCAPTCHA, hCaptcha, px_captcha, SimpleCaptcha, GeeTest CAPTCHA, FunCaptcha, Cloudflare Turnstile, AWS WAF Captcha, KeyCAPTCHA, and many others. Integrating Bright Data’s CAPTCHA Solver into your script is easy, as it works with any HTTP client or browser automation tool.
Find out more about how to use Bright Data’s CAPTCHA Solver. Also, check out the documentation for all integration and configuration details.
Conclusion
In this article, you learned why CAPTCHAs present a challenge for Playwright and how to address them. Using the Playwright Stealth library, you can override the default browser configuration to circumvent bot detection. Still, this approach may not always be enough.
No matter how sophisticated your Playwright script is, advanced bot detection systems can still identify you as a bot. The solution is to connect to your target page through an unlocking API that can seamlessly return the CAPTCHA-free HTML of any web page.
That API exists and is called Web Unlocker. It automatically rotates the exit IP with each request via proxy integration and handles browser fingerprinting, automatic retries, and CAPTCHA resolution for you. Forget about anti-bot measures!
Register now and start your free trial today.