How Web Scraping in Microsoft Excel Works – A Detailed Guide
In this post we will discuss:
- Collecting ‘table data’ using Excel’s Web Query tool
- Web data analysis in Excel
- Automated data collection tools that output data in Excel
Collecting ‘table data’ using Excel’s Web Query tool
For example, collecting data with Excel is much simpler than scraping with Python. The method that we will be focusing on is perfect if you are targeting web data that is organized in rows and columns (i.e., a table).
Here is a step-by-step guide to help you collect target web data and import it directly into an Excel Workbook so that you can get started with sorting, filtering, and analyzing:
Step 1: Open a new workbook
Data points need to be imported into an empty workspace, so either open a completely new Workbook file in Excel or add a new ‘Worksheet’ at the bottom of your existing file under the ‘Sheets’ tab.
Step 2: Run a web data query
You can run a new web-based query by going to the ‘data’ tab at the top of your Microsoft Excel worksheet, hitting the ‘Get Data’ button on the left, then go ahead and hit ‘From Other Sources’, and finally click ‘From Web’:
Step 3: Add your target URL
A new web query dialogue box will open. Now insert the target URL that contains your target data in a table that you wish to collect. Now hit ‘Import’.
Important to note: Excel will automatically identify any tables that appear in your target URL. It will display a small yellow arrow next to the different tables on the website / in the dialogue box. Click the arrow next to the table you wish to collect data from, and it will turn into a green check mark. Only once you have completed this for all the tables you are interested in should you click ‘import’.
Step 4: Decide where data is imported to
Excel will now display the next dialogue box in the sequence, known as ’the Import Data dialogue box’. Now either choose your newly opened and saved Worksheet under the ‘Existing Worksheet’ option or choose to open an entirely ‘New worksheet’, then click ‘OK’.
Step 5: Wait for Excel to import your target data
Depending on your target site, and the number of data points you are looking to collect and import, this can take anywhere from a few seconds to a few minutes.
Web data analysis in Excel
Now you can start working with your data in order to extract useful insights from it. For example, you can analyze your target data using the Excel-native ‘Pivot’ and ‘Regression’ models.
Pivot, allows you to perform data analysis, create data models as well as cross-reference data sets, and derive useful insights from the information collected. It also enables you to display data sets and insights in pie/bar charts that can help you more easily communicate data trends to colleagues.
Check out this in-depth Hubspot tutorial on analyzing data sets using the Pivot function.
Regression analysis can help you understand the relationship between different inputs and outputs. For example, the correlation between the cost of an item, and Advertising spend with conversion rates. This can help make strategic decisions such as which advertising channels are most profitable (i.e. worth focusing marketing budgets on).
Automated data collection tools that output data in Excel
While anonymous proxies and proxy IP locations spread across the globe have their utility when looking to perform data collection, there are major benefits to fully automating your business’s data collection operations.
Data Collector, for example, is a leading industry tool for data scraping automation. It enables professionals that need access to information to simply choose their target website (regardless of how the information is organized) and receive data output in their format of choice, including:
- Microsoft Excel
For those of you that want to utilize Excel’s strong data analysis tools mentioned above, it is very convenient that data can be output at the click of a button directly to an Excel spreadsheet. This can be set up for 1 website or a 1,000, Data Collector scales operations up or down based on your business’s needs. It can also be programmed to collect data points as frequently or infrequently as necessary (every hour? day? week? month? year?).
Ready to have your target data delivered directly to your team’s Microsoft Excel Workbook?