In the fast-paced world of eCommerce, staying ahead of the curve means keeping an eye on the competition. One way to do this is through web scraping, a technique for extracting data from websites. Whether you’re a seasoned developer or a newbie dipping your toes into the world of data extraction, this article is designed to help you understand the ins and outs of web scraping eCommerce websites.
There are all kinds of reasons you may be interested in scraping eCommerce websites, including competitive analysis, market research, price monitoring, lead generation, or data-driven decision-making.
In this tutorial, you’ll learn about some of the common challenges you’ll face when scraping eCommerce websites and how to scrape them using Playwright, a browser automation library with a Python API, and Bright Data’s Scraping Browser.
Problems with Web Scraping Using Local Browsers
Being able to write scrapers that extract vast amounts of data is powerful, but doing so for the first time can be challenging. For instance, when using local browsers, developers often run into issues that hinder their efficiency and effectiveness. Some of the most common problems include the following:
- IP blocking: Websites often track the IP addresses making requests. If they detect an abnormal number of requests from a single IP (typical in web scraping or brute-forcing), they may block that IP. When using a local browser, all requests come from a single IP, making this a significant issue.
- Rate limiting: Many websites implement rate limiting, allowing only a certain number of requests from an IP address within a given time period. If this limit is exceeded, further requests may be blocked or slowed down, hindering the scraping process.
- Lack of proxies: Without a pool of proxies, all requests in a scraping operation come from the same IP address. This makes it easier for websites to detect and block scraping activity. In contrast, using a pool of proxies allows the requests to come from different IP addresses, reducing the risk of detection.
- CAPTCHA challenges: Websites may use CAPTCHA challenges to verify that the user is a human and not a bot. Local browsers often lack the functionality to automatically solve these challenges, making them a substantial roadblock in scraping efforts.
- Dynamic website content: Many modern websites use JavaScript to load content dynamically. A local browser might struggle to scrape these websites accurately because the content might not be fully loaded before the scraping begins.
In the context of scraping with a local browser, these issues compound to make web scraping challenging. The lack of advanced features, such as IP rotation and automatic CAPTCHA solving, can slow down scraping processes and decrease the quality and quantity of data collected. It’s crucial for developers to be aware of these common problems and seek out tools and techniques to circumvent them effectively.
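For example, a common first attempt at working around rate limits with a local setup is to simply throttle your own requests. A minimal sketch of that approach might look like the following (the target URLs and delay values are illustrative assumptions, and the requests library must be installed); note that every request still comes from your single IP, so this only delays detection rather than preventing it:

import random
import time

import requests

# Hypothetical product listing pages to fetch
urls = [f'https://example.com/products?page={i}' for i in range(1, 6)]

for url in urls:
    response = requests.get(url)
    print(url, response.status_code)
    # Wait a randomized interval between requests to stay under a presumed rate limit
    time.sleep(random.uniform(2.0, 5.0))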
The next section will discuss how the Bright Data Scraping Browser can help solve these problems, making web scraping a much smoother and more productive experience.
How to Scrape with the Bright Data Scraping Browser
In the world of web scraping, Bright Data stands out as a cutting-edge provider, and at the heart of Bright Data’s offerings is its web scraping browser, a tool designed specifically to address the challenges faced in data extraction.
The Scraping Browser tackles IP blocking because it has access to a vast pool of residential and mobile IPs. This means you can rotate IPs and emulate organic user behavior, significantly reducing the risk of being blocked.
Similarly, by leveraging Bright Data’s extensive IP pool, the Scraping Browser can distribute requests across multiple IPs, effectively mitigating the rate-limiting issue. Moreover, with the Scraping Browser, you get automatic proxy management. This means Scraping Browser handles proxy rotation, ensuring that your scraping activities continue without manual intervention.
The Scraping Browser also offers advanced browser fingerprinting protection, allowing you to mimic a real user. This makes it harder for websites to detect and block your scraping efforts.
With these features in mind, dive into the tutorial and learn how to use the Bright Data Scraping Browser to scrape an eCommerce website. Here, you’ll be using Python as the programming language of choice.
Step 1: Set Up a New Python Project
The first step of this tutorial is to set up a new Python project. This is your workspace for the scraping task. You can use any text editor or integrated development environment (IDE) of your choice.
Additionally, make sure Python is installed on your machine. You can confirm this by typing python --version in your terminal. If Python is installed, this command displays its version; if not, you need to install it.
Once you’ve made sure Python is installed, it’s time to create your project directory. Open your terminal and navigate to where you want your project to reside. Then enter the following commands:
mkdir ecommerce-scraping # Create a new directory named ecommerce-scraping
cd ecommerce-scraping # Navigate into the newly created directory
python -m venv env # Create a new virtual environment in the project directory
source env/bin/activate # Activate the virtual environment (on Windows, run env\Scripts\activate instead)
Creating a virtual environment is a good practice, as it isolates your project and its dependencies from other Python projects, preventing any conflicts between different versions of libraries.
Step 2: Import Playwright in the Project
Playwright is a Python library for automating and testing web browsers. You’ll use it to control your Scraping Browser.
To install Playwright, use pip, which is a package installer for Python:
pip install playwright
After installing Playwright, you need to run the playwright install command. This downloads the browser binaries that Playwright needs to automate browsers:
playwright install
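If you want to verify the installation before connecting to the Scraping Browser, you can run a quick local smoke test. The following snippet is just a sanity check (it launches a local headless Chromium, not the Scraping Browser) and isn’t part of the tutorial’s main script:

from playwright.sync_api import sync_playwright

# Launch a local headless Chromium and fetch a page title
with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto('https://example.com')
    print(page.title())  # Prints "Example Domain" if everything is set up correctly
    browser.close()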
Step 3: Set Up a New Bright Data Account
Next, you need a Bright Data account. If you don’t have one, navigate to the Bright Data website and sign up. Once you have an account, you can create and manage your Scraping Browser instances and access your unique credentials.
Step 4: Create a New Scraping Browser Instance
Once you have access to a Bright Data account, log in and navigate to the Scraping Browser section, where you can create a new Scraping Browser instance.
Make a note of your Host ID, as you’ll need it when connecting to the Scraping Browser.
Step 5: Connect to the Scraping Browser Instance
Now it’s time to connect Playwright to your Scraping Browser instance. Bright Data provides an example script in its documentation that you can use as a starting point. Remember to replace YOUR_ZONE_USERNAME, YOUR_ZONE_PASSWORD, and YOUR_ZONE_HOST with your actual Bright Data credentials and the ID of the Scraping Browser instance you created:
import asyncio
from playwright.async_api import async_playwright

auth = 'YOUR_ZONE_USERNAME:YOUR_ZONE_PASSWORD'
browser_url = f'wss://{auth}@YOUR_ZONE_HOST'

async def main():
    async with async_playwright() as pw:
        print('connecting')
        browser = await pw.chromium.connect_over_cdp(browser_url)
        print('connected')
        page = await browser.new_page()
        print('goto')
        await page.goto('https://example.com', timeout=120000)
        print('done, evaluating')
        print(await page.evaluate('()=>document.documentElement.outerHTML'))
        await browser.close()

asyncio.run(main())
Save this file as main.py in your project directory. Then run the script with the following:
python main.py
This script connects to a Chromium instance running in the Scraping Browser and navigates to the URL you specified. It then prints the content of the web page and closes the browser.
At this point, you’ve validated that you can connect to the Scraping Browser instance. Since this is your baseline script, quickly go over the code:
- import asyncio and from playwright.async_api import async_playwright are the required imports for the script. asyncio is a library for writing single-threaded concurrent code using coroutines, and async_playwright is the asynchronous API of the Playwright library.
- auth = 'YOUR_ZONE_USERNAME:YOUR_ZONE_PASSWORD' sets up the authentication for the Bright Data Scraping Browser using the username and password of your zone.
- browser_url = f'wss://{auth}@YOUR_ZONE_HOST' constructs the WebSocket URL that connects to the Bright Data Scraping Browser.
- browser = await pw.chromium.connect_over_cdp(browser_url) connects to the Bright Data Scraping Browser using the Chromium browser. The await keyword pauses the execution of the function until the connection is established.
- await page.goto('https://example.com', timeout=120000) navigates the page to the specified URL. The timeout parameter specifies how long, in milliseconds, to wait for the navigation to complete before throwing an error.
- print(await page.evaluate('()=>document.documentElement.outerHTML')) evaluates the JavaScript code in the context of the page and prints the result. In this case, the JavaScript code returns the entire page’s HTML content.
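In practice, it’s also worth guarding the connection step, since a typo in your credentials or host surfaces as a connection error at connect_over_cdp(). Here’s a minimal variation of the baseline script with that guard added (the error handling is an illustrative pattern, not part of Bright Data’s example):

import asyncio
from playwright.async_api import async_playwright

auth = 'YOUR_ZONE_USERNAME:YOUR_ZONE_PASSWORD'
browser_url = f'wss://{auth}@YOUR_ZONE_HOST'

async def main():
    async with async_playwright() as pw:
        try:
            browser = await pw.chromium.connect_over_cdp(browser_url)
        except Exception as exc:
            # Usually means bad credentials or an unreachable host
            print(f'Could not connect to the Scraping Browser: {exc}')
            return
        page = await browser.new_page()
        await page.goto('https://example.com', timeout=120000)
        print(await page.title())
        await browser.close()

asyncio.run(main())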
Step 6: Scrape an eCommerce Website
Once you’ve connected to the Scraping Browser instance, you’re ready to start scraping. In this tutorial, you’ll scrape Books to Scrape, a sandbox eCommerce website that allows scraping.
Open up your main.py file and replace the contents with the following code; then run the script in your terminal:
import asyncio
from playwright.async_api import async_playwright

auth = 'YOUR_ZONE_USERNAME:YOUR_ZONE_PASSWORD'
browser_url = f'wss://{auth}@YOUR_ZONE_HOST'

async def main():
    async with async_playwright() as pw:
        print('connecting')
        browser = await pw.chromium.connect_over_cdp(browser_url)
        print('connected')
        page = await browser.new_page()
        print('goto')
        await page.goto('https://books.toscrape.com', timeout=120000)
        print('done, evaluating')
        print(await page.evaluate('()=>document.documentElement.outerHTML'))
        await browser.close()

asyncio.run(main())
You’ll see the content of the Books to Scrape home page printed out. At this point, the script doesn’t return anything particularly useful; you just get the entire HTML content of the target website.
Step 7: Extract Structured Information
To make this tutorial a little more useful, extract some structured data. This process varies depending on the specific data you’re interested in, but for this example, extract the names and prices of books on the home page.
Start by inspecting the books.toscrape.com home page and identifying the HTML elements that contain the book names and prices. The book names are in the <h3> tags inside <article class="product_pod">, and the prices are in the <p class="price_color"> tags inside the same <article> tags.
Here’s how to modify the script to extract this information:
import asyncio
from playwright.async_api import async_playwright

auth = 'YOUR_ZONE_USERNAME:YOUR_ZONE_PASSWORD'
browser_url = f'wss://{auth}@YOUR_ZONE_HOST'

async def main():
    async with async_playwright() as pw:
        print('connecting')
        browser = await pw.chromium.connect_over_cdp(browser_url)
        print('connected')
        page = await browser.new_page()
        print('goto')
        await page.goto('https://books.toscrape.com', timeout=120000)
        print('done, evaluating')

        # Find all the books in the article elements
        books = await page.query_selector_all('article.product_pod')

        # Extract and print each book's details
        async def get_book_details(book):
            # Extract the book name from the title attribute of the anchor tag
            book_name_element = await book.query_selector('h3 > a')
            book_name = await book_name_element.get_attribute('title')
            # Extract the displayed price text
            book_price_element = await book.query_selector('div.product_price > p.price_color')
            book_price = await book_price_element.inner_text()
            print(f'{book_name}: {book_price}')

        # Use asyncio.gather() to execute all async calls concurrently
        await asyncio.gather(*(get_book_details(book) for book in books))

        await browser.close()

asyncio.run(main())
When you run this script, you see a list of book names and their prices printed in your terminal, one book per line (for example, A Light in the Attic: £51.77).
This is a very simple example; however, it demonstrates how you can extract structured data from a website using Playwright and Bright Data. You can adapt this script to scrape different types of data from other pages or websites.
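As one example of adapting it, Books to Scrape paginates its catalog, and each page exposes a next link you can follow. Here’s a sketch of a multipage version of the script; the li.next > a selector and the five-page cap are assumptions based on the site’s current markup:

import asyncio
from playwright.async_api import async_playwright

auth = 'YOUR_ZONE_USERNAME:YOUR_ZONE_PASSWORD'
browser_url = f'wss://{auth}@YOUR_ZONE_HOST'

async def main():
    async with async_playwright() as pw:
        browser = await pw.chromium.connect_over_cdp(browser_url)
        page = await browser.new_page()
        await page.goto('https://books.toscrape.com', timeout=120000)

        async def get_book_details(book):
            name_element = await book.query_selector('h3 > a')
            name = await name_element.get_attribute('title')
            price_element = await book.query_selector('div.product_price > p.price_color')
            price = await price_element.inner_text()
            print(f'{name}: {price}')

        # Walk the catalog by following the "next" link on each page
        for _ in range(5):  # Cap at five pages for this example
            books = await page.query_selector_all('article.product_pod')
            await asyncio.gather(*(get_book_details(book) for book in books))
            next_link = await page.query_selector('li.next > a')
            if next_link is None:
                break  # Reached the last page
            await next_link.click()
            await page.wait_for_load_state()

        await browser.close()

asyncio.run(main())

Each iteration scrapes the current page, then clicks through to the next one until the cap is reached or the next link disappears.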
Now, take things a step further and generate a CSV file containing the scraped data.
Step 8: Save the Scraped Data to a CSV File
In order to save the scraped data to a CSV file, you need to import the csv module and create a new CSV file in the main() function. Then you can return the scraped data from the get_book_details() function and write it to the CSV file. Open your main.py file and update it with the following code:
import asyncio
import csv
from playwright.async_api import async_playwright

auth = 'YOUR_ZONE_USERNAME:YOUR_ZONE_PASSWORD'
browser_url = f'wss://{auth}@YOUR_ZONE_HOST'

async def main():
    async with async_playwright() as pw:
        print('connecting')
        browser = await pw.chromium.connect_over_cdp(browser_url)
        print('connected')
        page = await browser.new_page()
        print('goto')
        await page.goto('https://books.toscrape.com', timeout=120000)
        print('done, evaluating')

        # Find all the books in the article elements
        books = await page.query_selector_all('article.product_pod')

        async def get_book_details(book):
            # Extract the book name and price
            book_name_element = await book.query_selector('h3 > a')
            book_name = await book_name_element.get_attribute('title')
            book_price_element = await book.query_selector('div.product_price > p.price_color')
            book_price = await book_price_element.inner_text()
            return book_name, book_price

        # Use asyncio.gather() to execute all async calls concurrently
        book_details = await asyncio.gather(*(get_book_details(book) for book in books))

        # Write book details to a CSV file
        with open('books.csv', 'w', newline='') as f:
            writer = csv.writer(f)
            writer.writerow(['Book Name', 'Price'])  # Write header
            writer.writerows(book_details)  # Write book details

        await browser.close()

asyncio.run(main())
If you run this script, you’ll see a new file called books.csv in your project directory. Open it, and you’ll find the scraped data in CSV format: a header row followed by one book name and price per line.
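If you’d rather verify the output programmatically than open the file by hand, a few lines with Python’s built-in csv module will do. This check is an optional addition, not part of the scraper itself:

import csv

# Read books.csv back and print each row as a dictionary lookup
with open('books.csv', newline='') as f:
    for row in csv.DictReader(f):
        print(row['Book Name'], row['Price'])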
Conclusion
In this tutorial, you learned how to use Playwright and Bright Data to scrape data from an eCommerce website. This tutorial only scratched the surface of what you can do with Playwright and the Bright Data Scraping Browser, a proxy browser solution focused on unlocking data collection from websites that use advanced antibot detection techniques. The fundamentals discussed in this article can be combined into more advanced workflows to automate things such as price matching, market analysis, and lead generation.
Behind the scenes, Bright Data uses a complete proxy infrastructure to route your requests through a pool of millions of IPs. This allows you to scrape data from websites without getting blocked or banned. Sign up for a free trial and start experimenting with the Scraping Browser today!
Want to skip scraping eCommerce websites and just get the data? Purchase ready-made eCommerce datasets.