Web Scraping With VBA Guide

Unlock the potential of Excel for web scraping with VBA, a powerful way to automate data collection directly within your spreadsheets.
13 min read
Web Scraping with Excel VBA main blog image

In this Excel scraping guide, you will see:

  • What VBA is.
  • Whether a custom VBA module allows you to retrieve from the Web.
  • How to deal with web scraping in Excel after Internet Explorer end of support.
  • How to write VBA code for web scraping using Selenium.
  • The legacy approach to online data retrieval in Excel.

Let’s dive in!

What Is VBA?

VBA, short for Visual Basic for Applications, is a programming language developed by Microsoft. Its goal is to enable users to automate tasks and write custom functions in various Microsoft Office applications, including Excel, Word, and Access.

In Excel, VBA can be used to define complex macros that interact with data in spreadsheet cells. With VBA, you can improve productivity, simplify workflows, and extend Excel’s capabilities beyond its built-in functions.

Is It Possible to Perform Excel Web Scraping With VBA?

Yes, VBA opens the doors to web scraping directly in Excel. Thanks to VBA, you can write custom modules that:

  1. Connect to a web page in a browser.
  2. Parse its HTML content.
  3. Extract data from it.
  4. Write the scraped data directly within Excel cells.

This is what web scraping is all about. Thus, scraping in VBA is not only possible but also efficient because it directly imports data into Excel. Learn more about the pros and cons of scraping in Excel in the section!

Pros and Cons of Excel Web Scraping

Before digging into how to write VBA code for web scraping, let’s see some advantages and disadvantages of web scraping in Excel with VBA.

👍 Pros

  • Access to Excel features: Scraped data is directly imported into Excel, enabling you to use Excel powerful features for data manipulation and analysis.
  • Ready to go: Microsoft Office comes with VBA support out of the box. With just Microsoft Office 365 installed on your PC, you have everything you need to do web scraping.
  • All-in-one automation: With a VBA web scraping script, you can automate a complete data collection task, from data retrieval to data representation in Excel.

👎 Cons

  • Some features are only available on Windows: VBA packages for Active-X controls and COM automation are only available on Windows and do not work on Office for Mac.
  • VBA feels old: Visual Basic is not the most modern programming language around. If you have never worked with it, you may find it difficult to use due to its old syntax and approach to coding.
  • Internet Explorer is now deprecated: VBA COM automation to control a web browser is based on Internet Explorer, which is now deprecated.

The last pitfall is quite major, so it deserves a more in-depth analysis.

How to Deal With Internet Explorer Deprecation in VBA Web Scraping?

The legacy approach to writing VBA code for web scraping used to rely on the COM automation interface coming with the InternetExplorer object. This provided everything you needed to visit a site in Internet Explorer, parse its HTML content, and extract data from it. The problem is that Internet Explorer is no longer supported from June 15, 2022.

In other words, most recent versions of Windows do not even come with Internet Explorer. As a result, using the InternetExplorer object in a VBA script will result in an error. Since Edge is the replacement for Internet Explorer, you might be thinking about using the equivalent object in VBA. However, Edge does not come with the COM automation interface. Thus, you cannot control it programmatically as you did with Internet Explorer.

Instead, Edge supports automation via a web driver that can be controlled through browser automation technologies such as Selenium. So, the currently supported way to perform Excel VBA web scraping is using the Selenium binding for VBA. This allows you to control a browser like Chrome, Edge, or Firefox.

In the section below, you will build an Excel web scraping script using Selenium and Edge. Later, you will also see the code snippet of the legacy approach, which does not require any third-party dependencies.

How To Write VBA Code for Web Scraping Using Selenium

In this tutorial section, you will learn how to perform Excel web scraping in VBA using SeleniumBasic, the Selenium API binding for VBA.

The target site will be the Scrape This Site Country sandbox, which contains a list of all the countries in the world:

Scrape this site country sandbox page

The goal of the VBA scraper will be to automatically retrieve this data and write it into an Excel spreadsheet.

Time to write some VBA code!

Prerequisites

Make sure you have the latest version of Microsoft Office 365 installed on your machine. This section refers to Windows 11 and the Office 2024 update. At the same time, the steps below will be the same or similar for macOS and other versions of Office.

Note that the desktop version of Office is required to follow the tutorial. The free Microsoft 365 web platform does not support VBA scripting.

Step #1: Install and Set Up SeleniumBasic

Download the SeleniumBasic installer from the release page of the GitHub repository:

downloading selenium basic

Double-click on the .exe installer and wait for the installation process to complete.

As with most VBA packages, SeleniumBasic has not received an update in years. Because of that, it comes with web drivers that no longer work with recent browsers. To fix that, you need to manually override the executable files of the drivers in the SeleniumBasic installation folder.

Here, you will see how to override the Edge web driver, but you can follow the same procedure for Chrome and Firefox.

First, download the latest Stable version of the Microsoft Edge WebDriver:

downloading the latest stable version of the edge webdriver

The “x64” version should be what you are looking for.

You should now have an edgedriver_win64.zip file. Unzip it, enter the unzipped folder, and make sure it contains a msedgedriver.exe executable. That is the Edge WebDriver executable.

Rename it to “edgedriver.exe” and get ready to place it in the right folder.

Open the SeleniumBasic installation folder you should find at:

C:\Users\<YOUR_USERNAME>\AppData\Local\SeleniumBasic

Place edgedriver.exe inside this folder, overriding the existing Edge WebDriver executable.

Wonderful! SeleniumBasic will now be able to control the latest version of Edge in Excel.

Step #2: Launch Excel

Open the Windows Start Menu, type “Excel,” and click on the “Excel” app. Select the “Blank workbook” option to create a new spreadsheet:

Opening a new sheet in Excel

At the end of this section, this will contain the scraped data.

Step #3: Enable the Developer Tab

If you look at the tab bar on the top, you will not see any option to create a VBA script. That is because you must first enable that in the Excel configurations.

To do so, click on “File” in the top left corner:

clicking on file in the top corner

Then, select “Options” as below:

selecting options

In the “Options” popup window, reach the “Custom Ribbon” tab, and check the “Developer” option in the “Main tabs” section:

checking the develpoer option under custom ribbon

Press “OK” and a new “Developer” tab will appear:

Step #4: Initialize a VBA Web Scraping Module

Click on the “Developer” tab and press the “Visual Basic” button:

under the developer tab, clicking on visual basic

This will open the following window:

the newly opened window with a grey background

Here, click “Insert” on the top menu and then “Module” to initialize your VBA scraping module:

clicking on module under the insert option

This is what you should be seeing right now:

the new module opened

The “Book1 – Module1 (Code)” inner window is where you should write your VBA code for web scraping.

Step #5: Import Seleniumbasic

In the top menu, click “Tools” and then “References…”

clicking on references under tools

In the popup window, find the “Selenium Type Library” and check it:

checking selenium type library form the options

Click the “OK” button, and you will now be able to use Selenium in Excel to perform web scraping.

Step #6: Automate Edge to Open the Target Site:

Paste the code below in the VBA module window:

Sub scrape_countries()

' initialize a Selenium WebDriver instance

Dim driver As New WebDriver

' open a new Edge window

driver.Start "Edge"

' navigate to the target page

driver.Get "https://www.scrapethissite.com/pages/simple/"

' wait 10 seconds before shutting down the application

Application.Wait Now + TimeValue("00:00:10")

' close the current driver window

driver.Quit

End Sub

This initializes a Selenium instance and uses it to instruct Edge to visit the target page. Test the code by clicking on the run button:

click the run button to test the code

That will open the following Edge window:

the Edge window that opened

Note the “Microsoft Edge is being controlled by automated test software.” message which informs that Selenium is operating on Edge as desired.

If you do not want Edge to show up, you can enable the headless mode with this line:

driver.SetCapability "ms:edgeOptions", "{""args"":[""--headless""]}"

Step #7: Inspect the HTML Code of the Page

Web scraping involves selecting HTML elements on a page and collecting data from them. CSS selectors are among the most popular methods for selecting HTML nodes. If you are a web developer, you should already be familiar with them. Otherwise, explore the official documentation.

To define effective CSS selectors, you must first get familiar with the HTML of the target page. So, open the Scrape This Site Country sandbox in the browser, right-click on a country element, and select the “Inspect” option:

using the inspect option

Here, you can see that each country HTML element is <div> you can select with the following CSS selector:

.country

Given a .country HTML node, you should target:

  • The country name in the .country-name element.
  • The name of the capital in the .country-capital element.
  • The population information in the .country-population element.
  • The area in km² occupied by the country in the .country-area element.

Those are all the CSS selectors you need to select the desired HTML nodes and extract data from them. See how to use them in the next step!

Step #8: Write the VBA Web Scraping Logic

Use the FindElementsByCss() method exposed by driver to apply the.country CSS selector and select all country HTML nodes on the page:

Dim countryHTMLElements As WebElements

Set countryHTMLElements = driver.FindElementsByCss(".country")

Then, define an integer counter to keep track of the current Excel row to write data in:

Dim currentRow As Integer

currentRow = 1

Next, iterate over the country HTML nodes, extract the desired data from them, and write it to Excel cells using the Cells() function:

For Each countryHTMLElement In countryHTMLElements

' where to store the scraped data

Dim name, capital, population, area As String

' data retrieval logic

name = countryHTMLElement.FindElementByCss(".country-name").Text

capital = countryHTMLElement.FindElementByCss(".country-capital").Text

population = countryHTMLElement.FindElementByCss(".country-population").Text

area = countryHTMLElement.FindElementByCss(".country-area").Text

' write the scraped data in Excel cells

Cells(currentRow, 1).Value = name

Cells(currentRow, 2).Value = capital

Cells(currentRow, 3).Value = population

Cells(currentRow, 4).Value = area

' increment the row counter

currentRow = currentRow + 1

Next countryHTMLElement

Amazing! You are ready to take a look at the final Excel web scraping code.

Step #9: Put It All Together

Your VBA web scraping module should now contain:

Sub scrape_countries()

' initialize a Selenium WebDriver instance

Dim driver As New WebDriver

' enable the "headless" mode

driver.SetCapability "ms:edgeOptions", "{""args"":[""--headless""]}"

' open a new Edge window

driver.Start "Edge"

' navigate to the target page

driver.Get "https://www.scrapethissite.com/pages/simple/"

' select all country HTML nodes on the page

Dim countryHTMLElements As WebElements

Set countryHTMLElements = driver.FindElementsByCss(".country")

' counter to the current row

Dim currentRow As Integer

currentRow = 1

' iterate over each country HTML node and

' apply the Excel scraping logic

For Each countryHTMLElement In countryHTMLElements

' where to store the scraped data

Dim name, capital, population, area As String

' data retrieval logic

name = countryHTMLElement.FindElementByCss(".country-name").Text

capital = countryHTMLElement.FindElementByCss(".country-capital").Text

population = countryHTMLElement.FindElementByCss(".country-population").Text

area = countryHTMLElement.FindElementByCss(".country-area").Text

' write the scraped data in Excel cells

Cells(currentRow, 1).Value = name

Cells(currentRow, 2).Value = capital

Cells(currentRow, 3).Value = population

Cells(currentRow, 4).Value = area

' increment the row counter

currentRow = currentRow + 1

Next countryHTMLElement

' close the current driver window

driver.Quit

End Sub

Launch it and wait for the module execution to complete. At the end of the VBA script, the Excel spreadsheet will contain:

the Excel sheet containing data

This file contains the same data as the target site but in a semi-structured format. Analyzing and filtering this data will now be much easier thanks to the capabilities offered by Excel.

Et voilà! In less than 100 VBA lines of code, you just performed Excel web scraping!

Find out more in our guide on how web scraping works in Excel.

Legacy Approach to Web Scraping in VBA with Internet Explorer

If you are using an older version of Windows, you can perform web scraping in VBA with Internet Explorer.

All you have to do is enable the “Microsoft HTML Object Library” and “Microsoft Internet Controls” references:

enabling the microsoft html object library and microsoft internet controls

Note that these two packages are built-in in Excel, so you do not need to install extra libraries this time.

You can then achieve the same result as before with the following VBA code for web scraping:

Sub scrape_countries()

' start Internet Explorer

Dim browser As InternetExplorer

Set browser = New InternetExplorer

' enable the "headless" mode

browser.Visible = False

'visit the target page

browser.navigate "https://www.scrapethissite.com/pages/simple/"

' wait for the browser to load the page

Do: DoEvents: Loop Until browser.readyState = 4

' get the current page

Dim page As HTMLDocument

Set page = browser.document

' retrieve all country HTML nodes on the page

Dim countryHTMLNodes As Object

Set countryHTMLElements = page.getElementsByClassName("country")

' counter to the current row

Dim currentRow As Integer

currentRow = 1

' iterate over each country HTML node and

' apply the Excel scraping logic

For Each countryHTMLElement In countryHTMLElements

' where to store the scraped data

Dim name, capital, population, area As String

' data retrieval logic

name = countryHTMLElement.getElementsByClassName("country-name")(0).innerText

capital = countryHTMLElement.getElementsByClassName("country-capital")(0).innerText

population = countryHTMLElement.getElementsByClassName("country-population")(0).innerText

area = countryHTMLElement.getElementsByClassName("country-area")(0).innerText

' write the scraped data in Excel cells

Cells(currentRow, 1).Value = name

Cells(currentRow, 2).Value = capital

Cells(currentRow, 3).Value = population

Cells(currentRow, 4).Value = area

' increment the row counter

currentRow = currentRow + 1

Next countryHTMLElement

' close the current Internext Explorer window

browser.Quit

End Sub

Run this VBA module, and you will get the same result as before. Fantastic! You just did Excel web scraping with Internet Explorer.

Conclusion

In this guide, you learned what VBA is and why it allows you to do web scraping in Excel. The problem is that the VBA browser automation library relies on Internet Explorer, which is no longer supported. Here, you explored an equivalent approach to automate data retrieval in Excel with VBA using Selenium. Also, you saw the legacy approach for older versions of Windows.

At the same time, keep in mind that there are many challenges to take into account when extracting data from the Internet. In particular, most sites adopt anti-scraping and anti-bot solutions that can detect and block your VBA web scraping script. Avoid them all with our Scraping Browser solution. This next-generation browser integrates with Selenium and can automatically handle CAPTCHA resolution, browser fingerprinting, automated retries, and more for you!

Don’t want to deal with web scraping at all but are interested in Excel data? Explore our ready-to-use datasets. Not sure what data solution to choose? Contact us today!