Simplify your dynamic scraping operations
// Playwright (Node.js)
const pw = require('playwright');

const SBR_CDP = 'wss://brd-customer-CUSTOMER_ID-zone-ZONE_NAME:[email protected]:9222';

async function main() {
    console.log('Connecting to Scraping Browser...');
    // Attach to the remote browser over the Chrome DevTools Protocol
    const browser = await pw.chromium.connectOverCDP(SBR_CDP);
    try {
        const page = await browser.newPage();
        console.log('Connected! Navigating to https://example.com...');
        await page.goto('https://example.com');
        console.log('Navigated! Scraping page content...');
        const html = await page.content();
        console.log(html);
    } finally {
        // Always release the remote session, even if navigation fails
        await browser.close();
    }
}

main().catch(err => {
    console.error(err.stack || err);
    process.exit(1);
});
# Playwright (Python)
import asyncio
from playwright.async_api import async_playwright

SBR_WS_CDP = 'wss://brd-customer-CUSTOMER_ID-zone-ZONE_NAME:[email protected]:9222'


async def run(pw):
    print('Connecting to Scraping Browser...')
    # Attach to the remote browser over the Chrome DevTools Protocol
    browser = await pw.chromium.connect_over_cdp(SBR_WS_CDP)
    try:
        page = await browser.new_page()
        print('Connected! Navigating to https://example.com...')
        await page.goto('https://example.com')
        print('Navigated! Scraping page content...')
        html = await page.content()
        print(html)
    finally:
        # Always release the remote session, even if navigation fails
        await browser.close()


async def main():
    async with async_playwright() as playwright:
        await run(playwright)


if __name__ == '__main__':
    asyncio.run(main())
// Puppeteer (Node.js)
const puppeteer = require('puppeteer-core');

const SBR_WS_ENDPOINT = 'wss://brd-customer-CUSTOMER_ID-zone-ZONE_NAME:[email protected]:9222';

async function main() {
    console.log('Connecting to Scraping Browser...');
    // Connect to the remote browser instead of launching a local one
    const browser = await puppeteer.connect({
        browserWSEndpoint: SBR_WS_ENDPOINT,
    });
    try {
        const page = await browser.newPage();
        console.log('Connected! Navigating to https://example.com...');
        await page.goto('https://example.com');
        console.log('Navigated! Scraping page content...');
        const html = await page.content();
        console.log(html);
    } finally {
        await browser.close();
    }
}

main().catch(err => {
    console.error(err.stack || err);
    process.exit(1);
});
// Selenium (Node.js)
const { Builder, Browser } = require('selenium-webdriver');

const SBR_WEBDRIVER = 'https://brd-customer-CUSTOMER_ID-zone-ZONE_NAME:[email protected]:9515';

async function main() {
    console.log('Connecting to Scraping Browser...');
    // Point the WebDriver client at the remote server
    const driver = await new Builder()
        .forBrowser(Browser.CHROME)
        .usingServer(SBR_WEBDRIVER)
        .build();
    try {
        console.log('Connected! Navigating to https://example.com...');
        await driver.get('https://example.com');
        console.log('Navigated! Scraping page content...');
        const html = await driver.getPageSource();
        console.log(html);
    } finally {
        // quit() returns a promise; await it so the session closes before exit
        await driver.quit();
    }
}

main().catch(err => {
    console.error(err.stack || err);
    process.exit(1);
});
# Selenium (Python)
from selenium.webdriver import Remote, ChromeOptions
from selenium.webdriver.chromium.remote_connection import ChromiumRemoteConnection

SBR_WEBDRIVER = 'https://brd-customer-CUSTOMER_ID-zone-ZONE_NAME:[email protected]:9515'


def main():
    print('Connecting to Scraping Browser...')
    sbr_connection = ChromiumRemoteConnection(SBR_WEBDRIVER, 'goog', 'chrome')
    # The context manager quits the remote session on exit
    with Remote(sbr_connection, options=ChromeOptions()) as driver:
        print('Connected! Navigating to https://example.com...')
        driver.get('https://example.com')
        print('Navigated! Scraping page content...')
        html = driver.page_source
        print(html)


if __name__ == '__main__':
    main()
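The samples above hard-code the endpoint, credentials included. One way to keep the password out of source control is to read the full endpoint from the environment. This is a minimal sketch, not part of the product: the variable name `SBR_ENDPOINT` and the helper `load_endpoint` are hypothetical, and the check only confirms the scheme and the presence of a username and password.

```python
import os
from urllib.parse import urlsplit

# Hypothetical variable name chosen for this sketch; not defined by Bright Data.
ENV_VAR = "SBR_ENDPOINT"


def load_endpoint(env=None, allowed_schemes=("wss", "https")):
    """Return the Scraping Browser endpoint stored in the environment.

    Raises ValueError if the variable is missing, uses an unexpected
    scheme, or carries no username/password.
    """
    env = os.environ if env is None else env
    endpoint = env.get(ENV_VAR, "").strip()
    if not endpoint:
        raise ValueError(f"{ENV_VAR} is not set")
    parts = urlsplit(endpoint)
    if parts.scheme not in allowed_schemes:
        raise ValueError(f"unexpected scheme: {parts.scheme!r}")
    if not parts.username or not parts.password:
        raise ValueError("endpoint must include credentials")
    return endpoint
```

The returned string can then be passed to `connect_over_cdp` (Playwright) or `ChromiumRemoteConnection` (Selenium) in place of the constant.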
Cloud-based dynamic scraping
- Run your Puppeteer, Selenium or Playwright scripts
- Automated proxy management and web unlocking
- Troubleshoot and monitor using Chrome DevTools
- Fully-hosted browsers, optimized for scraping
Benefits of Scraping Browser
Cut infrastructure overheads
Set up and auto-scale your browser environment through a single API, with unlimited concurrent sessions and workloads for continuous scraping
Increase success rates
Stop building unlocking patches and future-proof your access to any public web data with a built-in unlocker and an extensive residential IP pool
Boost developer productivity
Keep your developers focused on what matters: run your existing scripts in a hybrid cloud with just one line of code and free them from the hassle of scraping operations
Auto-scale browser infrastructure
Connect your interactive, multi-step scraping scripts to a hybrid browser environment that offers unlimited concurrent sessions using a single line of code
Chrome DevTools compatible
Use the Chrome DevTools debugger to monitor and troubleshoot your Scraping Browser sessions
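Even when the service allows many concurrent sessions, a client script usually wants to cap how many it opens at once. The sketch below shows one way to do that with `asyncio`; `scrape_many` and `fetch_html` are hypothetical helper names, and `fetch_html` assumes the `playwright` package is installed and that `endpoint` is your own connection string.

```python
import asyncio


async def scrape_many(urls, fetch, max_concurrency=5):
    """Run fetch(url) for every URL, at most max_concurrency at a time.

    Returns a dict mapping each URL to its result, or to the exception
    that fetch raised for it.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(url):
        async with semaphore:
            try:
                return url, await fetch(url)
            except Exception as exc:  # keep going if one session fails
                return url, exc

    pairs = await asyncio.gather(*(bounded(u) for u in urls))
    return dict(pairs)


async def fetch_html(url, endpoint):
    """Open one remote browser session, load url, and return its HTML."""
    # Imported lazily so scrape_many can be used without Playwright installed
    from playwright.async_api import async_playwright

    async with async_playwright() as pw:
        browser = await pw.chromium.connect_over_cdp(endpoint)
        try:
            page = await browser.new_page()
            await page.goto(url)
            return await page.content()
        finally:
            await browser.close()
```

A call such as `asyncio.run(scrape_many(urls, lambda u: fetch_html(u, endpoint), max_concurrency=10))` would then fetch all pages while holding at most ten sessions open.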
Tap into autonomous unlocking
Browser Fingerprinting
Emulate real users' browsers to simulate a human experience
CAPTCHA Solving
Analyze and solve CAPTCHAs and challenge-response tests
Manage Specific User Agents
Automatically mimic different types of browsers and devices
Set Referral Headers
Simulate traffic originating from popular or trusted websites
Handle Cookies
Prevent potential blocks imposed by cookie-related factors
Automatic Retries and IP Rotation
Continually retry requests, and rotate IPs, in the background
Worldwide Geo-Coverage
Access localized content from any country, city, state or ASN
JavaScript Rendering
Extract data from websites that rely on dynamic elements
Data Integrity Validations
Ensure the accuracy, consistency and reliability of data
Hyper-extensive pool of real IPs
Access the web as a real user through 72M+ ethically sourced residential IPs, coverage in 195 countries, and APIs for advanced configuration and management
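Country targeting is typically selected in the connection string. The sketch below assumes that a `-country-<code>` suffix on the username picks the exit country, as is common for Bright Data proxy zones; treat that flag as an assumption and confirm it in the product documentation. The helper name `with_country` is hypothetical, and city, state, and ASN targeting are not covered.

```python
from urllib.parse import urlsplit, urlunsplit


def with_country(endpoint, country_code):
    """Return endpoint with '-country-<code>' appended to the username.

    Assumption: the service accepts a country suffix on the username;
    check the exact flag against the official documentation.
    """
    code = country_code.strip().lower()
    if len(code) != 2 or not code.isalpha():
        raise ValueError("expected a two-letter country code")
    parts = urlsplit(endpoint)
    if parts.username is None:
        raise ValueError("endpoint has no username")
    # Split 'user:password@host:port' without touching the password
    userinfo, _, hostport = parts.netloc.rpartition("@")
    user, sep, password = userinfo.partition(":")
    netloc = f"{user}-country-{code}{sep}{password}@{hostport}"
    return urlunsplit(parts._replace(netloc=netloc))
```

Rewriting only the username keeps the password, host, and port exactly as they were supplied.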
Scraping Browser Pricing
Pay with AWS Marketplace
Streamline payments with the AWS Marketplace, enhancing procurement and billing efficiency. Use existing AWS commitments and benefit from AWS promotions
24/7 support
Get round-the-clock expert support, resolve issues quickly, and assure quality data delivery. Gain real-time visibility into network status for full transparency
FAQ
What is Scraping Browser?
Scraping Browser works like other automated browsers and is controlled through common high-level APIs such as Puppeteer and Playwright, but it is the only browser with built-in website unblocking capabilities. It automatically manages all website unlocking operations under the hood, including CAPTCHA solving, browser fingerprinting, automatic retries, header selection, cookie handling, and JavaScript rendering, so you can save time and resources.
When do I need to use a browser for scraping?
Developers use automated browsers for scraping when a page needs JavaScript rendering or when the script has to interact with the website (hovering, changing pages, clicking, taking screenshots, etc.). Browsers are also useful for large-scale scraping projects that target multiple pages at once.
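The interactions listed in this answer map directly onto browser-automation calls. Here is a minimal Playwright (Python) sketch of a hover, a click, and a screenshot; the helper names, the selectors `nav` and `a.next`, and the file name `page.png` are all hypothetical, and `endpoint` stands for your own connection string.

```python
async def hover_click_capture(page, hover_selector, click_selector, shot_path):
    """Hover one element, click another, then save a screenshot.

    `page` is a Playwright async Page (or any object with the same methods).
    Returns the page HTML after the interaction.
    """
    await page.hover(hover_selector)
    await page.click(click_selector)
    # Give any navigation triggered by the click time to finish
    await page.wait_for_load_state("load")
    await page.screenshot(path=shot_path)
    return await page.content()


async def run(endpoint, url):
    """Connect to a remote browser, open url, and perform the interaction."""
    # Imported lazily so the helper above can be exercised without Playwright
    from playwright.async_api import async_playwright

    async with async_playwright() as pw:
        browser = await pw.chromium.connect_over_cdp(endpoint)
        try:
            page = await browser.new_page()
            await page.goto(url)
            # Selectors and file name are placeholders for this sketch
            return await hover_click_capture(page, "nav", "a.next", "page.png")
        finally:
            await browser.close()
```

A plain HTTP request cannot perform any of these steps, which is why this kind of task calls for a browser.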
Is Scraping Browser a headless browser or a headful browser?
Scraping Browser is a GUI browser (aka a “headful” browser): it runs with a graphical user interface on Bright Data’s infrastructure. A developer, however, experiences it as headless, interacting with the browser through an API such as Puppeteer or Playwright.
What’s the difference between headful and headless browsers for scraping?
Developers can choose between a headless and a GUI (headful) automated browser. A “headless browser” is a web browser without a graphical user interface. Combined with a proxy, headless browsers can scrape data, but bot-protection software detects them easily, which makes large-scale scraping difficult. GUI browsers such as Scraping Browser render a graphical user interface, and bot-detection software is less likely to flag them.
Why is Scraping Browser better than headless Chrome or Selenium web scraping in Python?
Scraping Browser comes with a built-in website unlocking feature that handles blocking for you automatically. Because its browsers run on Bright Data’s servers, it suits scaling web scraping projects without requiring extensive infrastructure of your own.
Is the Scraping Browser compatible with Puppeteer scraping?
Yes, Scraping Browser is fully compatible with Puppeteer.
Is Playwright scraping compatible with the Scraping Browser?
Yes, Scraping Browser is fully compatible with Playwright.
When should I use Scraping Browser instead of other Bright Data proxy products?
Scraping Browser is an automated browser optimized for data scraping that builds in Web Unlocker’s automated unlocking capabilities. Web Unlocker works with one-step requests, whereas Scraping Browser is needed when a developer has to interact with a website to retrieve its data. It is also a good fit for any scraping project that requires browsers, scaling, and automated management of all website unblocking actions.