What is alternative data?
The term refers to any information about a financial instrument from non-traditional sources. Conventional sources used by investment professionals typically include SEC filings, financial records, press releases, and media reports.
Sometimes analysts need to complement or support their insights, so they look to other sources, such as news sentiment, expert networks, and web-crawled data. Information gathered this way is called ‘alternative data’.
In this context, there are two main sources of data for investment managers to draw on:
1. Traditional data – financial reports, news, trading reports, SEC filings
2. Alternative data – payments, geolocation, social media, and satellite imagery
How is alternative data generated?
Alternative data can come from many sources, as we explained previously.
But who creates this data?
There are three main alternative data sources:
1. Individuals: Regular people generate huge amounts of data every day through social media interactions, work, and searches on Google, Bing, and Yahoo. Every time someone posts a comment or a review on an eCommerce site, they create alternative data indicative of behavioral patterns. This data is considered ‘unstructured alt data’ and can be used as one of many factors in the corporate decision-making process.
2. Companies: Businesses, on the other hand, tend to create structured data which is easier to analyze and can provide deeper insights when making financial decisions. This includes transactional data – data that is generated as a result of a purchase, credit card transaction, or the like. Data from government agencies, taxes, and so on is also included in this group.
3. Internet of Things (IoT) generated data: This data is generally unstructured since it is generated by sensors and endpoint devices. IoT devices like smart TVs, Point of Sale (POS) systems, and parking and traffic sensors provide useful data that, if properly analyzed, can yield powerful insights. For instance, it can show how frequently people pass through a certain street or how often customers visit a certain mall. Included in this group is data generated by cell phones and other geolocation-based systems.
The different types of alternative data
Web data – web searches, click-through rates, web demographics. This is especially useful for marketing and eCommerce research.
Social sentiment data – consumer behavior and reactions to brands’ content and positioning. This includes comments, online interactions, tweets, and posts. This can point you in the direction of current market trends and changes in consumer behavior.
Geolocation data – this type of data can help companies understand which locations have a higher demand for specific products, for example. Real estate investors can also use this type of data to identify areas with a positive outlook for project development based on alt data points such as zoning regulations or new infrastructure being built.
Credit card transactions – transactional data can be used to track retail revenue and consumer discretionary spending ahead of retailers’ earnings reports, and to assess payment habits for loan evaluation.
Point of Sale (POS) transactions – can provide information about sales volume, consumer behaviors, which products are popular, as well as which payment methods are preferred among different consumer segments.
Weather and satellite imagery – while this data is mainly collected raw (as images), it can be fed to algorithms and/or analytical tools that can draw concrete conclusions and predictions. For example, it can be used to measure the economic activity of a certain area or demographic, including the time of day when activity peaks and the number of stores that are open or active (this was especially useful during the COVID-19 pandemic in helping people avoid crowded stores and thereby reduce the risk of infection).
Why is alternative data so popular?
Investment management firms leverage data to identify patterns and gain unique insights about investment products. Hedge funds were among the first to take advantage of data analytics technologies and big data, closely followed by private equity managers. These same ‘avant-garde’ firms are leading the charge on alternative data, since early adopters are positioned to benefit the most before alternative data enjoys widespread adoption.
What makes alternative data so attractive?
The massive datasets available offer a potential advantage over competitors. The amount of global data generated is expected to reach 163 zettabytes (ZB) by 2025. That means more data to feed Artificial Intelligence [AI] tools, more potential patterns and trends to discover, and more possibilities of gaining an edge over competitors.
With that in mind, investment companies are hiring data scientists and analysts at an ever-growing rate in order to help with such data-mining efforts. According to the Financial Times, the number of data analysts in investment firms is growing exponentially.
What is the role of alternative data in model-driven investing?
Model-driven investing refers to the use of analytical data models in order to find insights for the financial sector in general, and investing in particular. While most firms have yet to completely depart from traditional data sources, alternative data is becoming increasingly important for investment firms trying to identify innovative, new ideas to generate increased alpha.
Quick definition: According to Investopedia “Alpha (α) is a term used in investing to describe an investment strategy’s ability to beat the market or its edge.”
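As a rough illustration of that definition, the simplest reading of alpha is the return of a strategy minus the return of its benchmark over the same period. The sketch below assumes this simple, non-risk-adjusted reading; the function name and the return figures are made up for illustration.

```python
def simple_alpha(strategy_return: float, benchmark_return: float) -> float:
    """Excess return of a strategy over its benchmark, both given as decimal fractions."""
    return strategy_return - benchmark_return


# Hypothetical figures: the strategy returned 12% while the benchmark returned 9%.
alpha = simple_alpha(0.12, 0.09)
print(f"alpha = {alpha:.2%}")  # prints: alpha = 3.00%
```

A positive value means the strategy beat the benchmark over the period; a negative value means it lagged.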
Furthermore, due to COVID-19, there was a greater shift towards online activities and a digital marketplace that is driving banks and investors to turn to alternative data as a source for decision-making. This type of data can provide a close to real-time picture that lets financial institutions come to timely decisions about risk management, loans, and so on.
Achieving a quant or model-driven investing approach involves two steps: gathering the data and analyzing it. The data can be collected using web crawling tools, data platforms, or data providers that specialize in alternative data.
However, finding the data is just the first step. Data can work for you only after it is analyzed and interpreted. Since alternative data comes from disparate sources and is often unstructured, it can be more difficult to analyze than traditional data. Machine Learning [ML] and Natural Language Processing [NLP] tools are critical to analyzing the huge datasets generated by alternative data. AI tools can process data at a much faster pace than any human. AI-based models and data providers can assist the investment industry in finding the patterns and insights necessary to make accurate decisions.
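To make the analysis step more concrete, here is a deliberately minimal sketch of turning unstructured text into a numeric signal. It uses a tiny hand-written word list rather than a trained NLP model, so it only hints at what real tools do. The word lists, the function name, and the sample comments are all illustrative assumptions.

```python
import re

# Tiny illustrative lexicons; a real pipeline would rely on a trained model
# or a much larger curated word list.
POSITIVE = {"love", "great", "excellent", "fast", "recommend"}
NEGATIVE = {"hate", "slow", "broken", "refund", "terrible"}


def sentiment_score(text: str) -> float:
    """Return a score in [-1, 1]: (positive hits - negative hits) / total hits."""
    words = re.findall(r"[a-z']+", text.lower())
    pos = sum(word in POSITIVE for word in words)
    neg = sum(word in NEGATIVE for word in words)
    total = pos + neg
    # Text with no lexicon words is treated as neutral.
    return 0.0 if total == 0 else (pos - neg) / total


# Hypothetical user comments standing in for collected social media data.
comments = [
    "Love the new app, checkout is fast and I would recommend it",
    "Terrible update, the cart is broken and I want a refund",
    "Arrived on Tuesday",
]
scores = [sentiment_score(comment) for comment in comments]
print(scores)
```

Averaging such scores per brand and per day would give a simple time series that can sit alongside traditional data in a model.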
Alternative data use cases
Alternative data is likely to transform the way investment companies and hedge funds select investments over the coming years. The applications of alternative data for generating ideas, evaluating investments, and managing portfolios can be powerful when combined with data analytics tools. Below are some common use cases for alternative data:
Tracking price changes and inflation – firms can track datasets with millions of prices to understand price changes and the effects of inflation.
Using social media to predict earnings – an asset manager can mine social media and search engine data to forecast a company’s earnings over a specific period of time.
Payment data to track performance – a hedge fund can use combined data, such as credit card transactions, location data, and app usage to track the online and app sales performance of a retail company.
Web data and social media comments to forecast market movements – you can use data from crawled websites and social media to detect events that may move the market.
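The first use case in this list, tracking price changes, can be sketched in a few lines. The example below assumes you already have prices for the same products in two periods and computes an unweighted geometric mean of the price ratios (a Jevons-style index). The product names, prices, and function name are hypothetical, and a production index would typically also weight products by spending.

```python
from statistics import geometric_mean

# Hypothetical collected prices for the same products in two periods.
prices_january = {"milk_1l": 1.00, "bread_loaf": 2.00, "coffee_250g": 5.00}
prices_february = {"milk_1l": 1.10, "bread_loaf": 2.00, "coffee_250g": 5.50}


def price_index(base: dict[str, float], current: dict[str, float]) -> float:
    """Unweighted geometric mean of price ratios, scaled so the base period is 100."""
    # Only products observed in both periods can be compared.
    matched = sorted(base.keys() & current.keys())
    if not matched:
        raise ValueError("no products appear in both periods")
    ratios = [current[item] / base[item] for item in matched]
    return 100 * geometric_mean(ratios)


index = price_index(prices_january, prices_february)
print(f"Price index (January = 100): {index:.2f}")
```

A value above 100 indicates that the tracked basket became more expensive between the two periods.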
How to source alternative data
You can obtain alternative data either by collecting it directly from the internet yourself or by buying it from a third-party vendor. Let’s explore both options.
Option 1: Web crawling
This refers to collecting data from websites via a web scraping tool or in-house scraping software. The software crawls the web pages, downloading the relevant data according to specific keywords. The data can then be saved in a number of formats, such as a CSV file. The applications for data scraping tools are wide-ranging, from brand protection to price verification.
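The keyword-and-CSV flow described above can be sketched with Python's standard library alone. This is a minimal illustration, not a full crawler: the sample HTML stands in for a page that has already been downloaded, and the class name, function names, and keywords are assumptions chosen for the example.

```python
import csv
import io
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect the non-empty text chunks found between HTML tags."""

    def __init__(self) -> None:
        super().__init__()
        self.chunks: list[str] = []

    def handle_data(self, data: str) -> None:
        text = data.strip()
        if text:
            self.chunks.append(text)


def extract_matches(html: str, keywords: list[str]) -> list[dict[str, str]]:
    """Return one row per (keyword, text chunk) pair where the chunk mentions the keyword."""
    parser = TextCollector()
    parser.feed(html)
    parser.close()
    rows = []
    for chunk in parser.chunks:
        for keyword in keywords:
            if keyword.lower() in chunk.lower():
                rows.append({"keyword": keyword, "text": chunk})
    return rows


def to_csv(rows: list[dict[str, str]]) -> str:
    """Serialize the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["keyword", "text"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Stand-in for a page the crawler has already downloaded.
SAMPLE_HTML = """
<html><body>
  <h1>Store updates</h1>
  <p>Price drop on running shoes this week.</p>
  <p>New opening hours from Monday.</p>
</body></html>
"""

rows = extract_matches(SAMPLE_HTML, ["price", "shoes"])
print(to_csv(rows))
```

In a real setup the HTML would come from an HTTP request per target page, and the CSV text would be written to a file or handed to a downstream system.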
You have quite a few options in terms of tools. At one end are Do It Yourself (DIY) solutions, in which you integrate with a proxy network and leverage real consumer IPs. At the other are fully automated solutions that require zero coding or infrastructure. All you need to do is define your:
- target data sets
- desired format
- schedule
- preferred delivery method
and enjoy a live stream of data directly to your team and/or systems.
Option 2: Buying data sets
There are alternative data providers that can supply you with data at different stages of processing. You can buy the data raw, ‘clean’, or semi-structured, for example. This is a good option for companies that need ‘static’ data sets, meaning it is not crucial for them to have real-time data fed to their team and systems. For example, a fashion house may want social media data once a season to identify new trends, but does not need it on a daily basis. On the other hand, an eCommerce business may want to scan competitor pricing on an hourly basis and make live changes in order to undercut or ‘outprice’ the competition. In the latter case, buying data sets may be a less viable option.
What’s next?
As companies identify the value that alternative data has to offer the economy at large, and the financial sector in particular, we will start to see widespread adoption of alt data prediction models and alt data-driven revenue streams. When companies learn to collect:
- accurate
- clean
- user-generated
alternative data and add a level of sophistication by feeding these raw data sets to predictive algorithms and Artificial Intelligence, we will really begin to see the impact.
Imagine retail chains that make production and collection decisions based on consumer sentiment on social media.
Imagine investment houses that invest in or against (aka shorting) securities based on consumer activity derived from real-time transactional data.
Imagine insurance companies that can perform a risk assessment based on natural phenomena geospatial data (think hurricanes, tsunamis, and floods).
You no longer need to use your imagination. The above are real use cases of alt data being leveraged by visionary companies that have decided to lead their industry instead of being led by others.