New Finance Sector Survey Highlights Key Alt Data Integration Obstacles: 61% Cite Analysis Issues, 53% Point to Sourcing Challenges

Learn how ‘data value chain augmentation’ is driving higher ROI through razor-sharp data collection and governance strategies, resulting in higher-quality algorithmic output, insights, and investment decisions

In this piece we will look at:

  • The key obstacles to integrating alt data in big finance
  • The main analysis and data-sourcing challenges facing the sector
  • How financial institutions are thriving despite these challenges

A recently published financial industry survey polled one hundred professionals, from lending and hedge funds to banking and insurance, in both the U.S. and U.K. The report highlights that the vast majority of financial institutions understand they need to rely on external data sources, yet large swaths of this dynamic sector possess neither the in-house knowledge nor the expertise to properly analyze alternative data and truly derive operational benefits from it. This article focuses on both the challenges and the technological coping mechanisms currently being used by industry players.

Infographic: how much each industry relies on alt data to function (Image source: Bright Data)

What are the key obstacles to integrating alt data within the context of big finance?

‘Key alt data integration obstacles within the context of big finance include issues at the analysis level, with 61% of respondents* citing this as their most likely challenge, while 53% cited data sourcing/procurement as their main challenge.’

Main financial sector alt data analysis impediments

Based on survey findings:

‘64% of financial services professionals* use alt data as part of formulating their ongoing investment strategies.’

But things start going awry when portfolio managers source data:

  • From a large variety of sources
  • At extremely high volumes

Data analyst teams also face challenges, particularly around the quality and compatibility of the data collected to feed investment algorithms. This is especially true of semi-structured and unstructured data, which is hard to integrate with data-driven trading models that have their own unique presets. This is the crux of the issue for many financial institutions that lack the cyber infrastructure to effectively process and cross-reference alternative data sets.
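
To make the integration problem concrete, here is a minimal sketch of the kind of normalization layer such infrastructure performs: mapping semi-structured records from two different sources onto one schema a trading model can consume. The sources, field names, and rescaling are illustrative assumptions, not any particular vendor’s format:

```python
import json
import pandas as pd

# Hypothetical raw records: the same signal arriving from two sources
# with different, semi-structured shapes.
social_raw = '[{"sym": "AAPL", "ts": "2023-05-01T14:32:00Z", "sentiment": 0.72}]'
filings_raw = '[{"ticker": "AAPL", "filed_at": "2023-05-01", "tone_score": 61}]'

def normalize_social(record: dict) -> dict:
    # Map source-specific fields onto the schema the trading model expects.
    return {
        "ticker": record["sym"],
        "timestamp": pd.Timestamp(record["ts"]),
        "signal": record["sentiment"],           # already on a 0-1 scale
        "source": "social",
    }

def normalize_filings(record: dict) -> dict:
    return {
        "ticker": record["ticker"],
        "timestamp": pd.Timestamp(record["filed_at"], tz="UTC"),
        "signal": record["tone_score"] / 100.0,  # rescale 0-100 to 0-1
        "source": "filings",
    }

rows = [normalize_social(r) for r in json.loads(social_raw)]
rows += [normalize_filings(r) for r in json.loads(filings_raw)]

# One structured frame, ready to be cross-referenced or fed to a model.
unified = pd.DataFrame(rows)
print(unified)
```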

This, coupled with a real shortage of skilled labor trained in the newest data collection and processing techniques, is creating roadblocks for companies across the financial landscape.

Lastly, financial organizations are finding it increasingly difficult to live up to the data agility standards they have set for themselves. They collect target data and store it in reservoirs the size of Lake Superior, but still can’t seem to contextualize it in order to draw the bigger-picture conclusions that lead to more meaningful monetization opportunities.

Key data-sourcing challenges

Data analysis hardships are interlinked with, and in some senses overlap, data-sourcing obstacles. The biggest data-sourcing challenges include:

Data identification – being able to spot and categorically organize data sets that have retained their metadata, by use case and asset class, which can have sizable operational implications.

Process replicability – teams are often able to source one-off datasets because a certain cache of information comes online, or due to some other singular event. Firms, however, need a reliable and consistent flow of data.

Information quality – AI and ML models have an important ‘training stage’; it is in this phase that they need to be fed clean, traceable data so that their output is high quality and accurate. Time lag or geo-file corruption, for example, could severely damage algorithmic insights such as which securities positions to close, and at what exit point (a minimal quality gate is sketched after this list).

Disparate sources – data does not all come from the same place or in the same format. Some data may come from social platforms, while other data comes from search engine results and/or Securities and Exchange Commission (SEC) filings. Formats can range from video and voice files to text and system logs. Aggregating and cross-referencing these into a unified system can be challenging.
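
Here is the minimal quality gate referenced under ‘Information quality’ above: a filter that rejects records with corrupt geo fields or excessive time lag before they reach a model’s training set. The field names and the seven-day staleness tolerance are illustrative assumptions:

```python
from datetime import datetime, timedelta, timezone

# Hypothetical incoming alt data records; field names are illustrative.
records = [
    {"ticker": "XOM", "geo": "US-TX", "value": 1.8,
     "observed_at": datetime(2023, 5, 1, 9, 30, tzinfo=timezone.utc)},
    {"ticker": "XOM", "geo": None, "value": 2.1,   # corrupt geo field
     "observed_at": datetime(2023, 5, 1, 9, 31, tzinfo=timezone.utc)},
    {"ticker": "XOM", "geo": "US-TX", "value": 1.9,  # stale observation
     "observed_at": datetime(2023, 4, 1, 9, 30, tzinfo=timezone.utc)},
]

MAX_LAG = timedelta(days=7)  # tolerance is an assumption; tune per strategy

def passes_quality_gate(rec: dict, now: datetime) -> bool:
    """Keep only clean, traceable, timely records for model training."""
    if rec["geo"] is None:                   # reject geo-file corruption
        return False
    if now - rec["observed_at"] > MAX_LAG:   # reject excessive time lag
        return False
    return True

now = datetime(2023, 5, 2, tzinfo=timezone.utc)
training_set = [r for r in records if passes_quality_gate(r, now)]
print(f"kept {len(training_set)} of {len(records)} records")  # kept 1 of 3
```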

How financial institutions are thriving, despite challenges

More and more financial institutions are realizing the benefits of outsourcing their data collection needs, either by purchasing ready-to-use data sets or by hooking their systems and teams up with tools that provide an automated, live stream of data.

This approach eliminates the majority of challenges experienced in the analysis stage, as data collection networks are able to:

  • Tailor data sets to your firm’s specific needs
  • Ensure input/output is in your desired file format, enabling a more effortless integration of data from disparate sources
  • Help you avoid investing in costly data systems and data collection specialists
  • Give your fund the agility to turn data collection operations on and off on a per-project basis (commonly known as ‘Data on Demand’; see the sketch below)
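
Below is a sketch of what ‘Data on Demand’ from the last bullet might look like at the code level. DataFeedClient and its methods are hypothetical stand-ins, not a real vendor SDK:

```python
# Illustrative only: "DataFeedClient" and its methods are hypothetical
# stand-ins for whatever SDK a data collection network actually exposes.
class DataFeedClient:
    def __init__(self) -> None:
        self._active: set[str] = set()

    def start(self, project: str) -> None:
        # In a real SDK this would provision collectors and open a stream.
        self._active.add(project)
        print(f"collection ON for {project!r}")

    def stop(self, project: str) -> None:
        # ...and this would tear them down, so idle feeds stop costing money.
        self._active.discard(project)
        print(f"collection OFF for {project!r}")

feed = DataFeedClient()
feed.start("q3-retail-footfall")  # spin up only while the project runs
feed.stop("q3-retail-footfall")   # pay per project, not 24/7
```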

This approach can also be a prudent choice for smaller, boutique offices that want to focus on data-driven investing rather than letting data collection take up the lion’s share of their time.

The key benefits of treating data as a commodity in the context of financial services and investing include:

Getting a real-time, low-latency flow of data to inform immediate, in-the-moment operational decisions (buy, sell, short, etc.)

Making AI and ML training and customization much more accessible, so that creating and testing rapid trading models becomes second nature to team members

Scaling operations up or down quickly and painlessly

Eliminating the need for purchasing/developing costly hardware, software, and proprietary protocols/APIs

Unlocking hard-to-obtain or purposely undiscoverable data sets, as was the case with American hospitals that worked to hide procedure rates using a designated code snippet
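
To make the first benefit concrete, here is a toy decision rule mapping a real-time alt data signal onto an operational action. The signal scale, thresholds, and ticks are all invented for illustration; a production model would be far more involved:

```python
def decide(signal: float) -> str:
    """Map a real-time alt data signal (-1..1) to an operational action."""
    if signal > 0.5:       # threshold chosen purely for illustration
        return "buy"
    if signal < -0.5:
        return "short"
    return "hold"

# Simulated low-latency stream of signal ticks.
for tick in (0.62, 0.10, -0.71):
    print(tick, "->", decide(tick))  # buy, hold, short
```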

The bottom line

Alt data collection and implementation are still in the ‘adoption stage’ in the financial sector, meaning there is still room to gain a significant ‘information advantage’.

Despite the fact that most institutions know they have a lot to gain from leveraging external data, many lack the ability or expertise to properly analyze and leverage it.

It therefore follows that the ‘data value chain’ behaves differently from most other informational commodities: volume, cost, quality, and output do not scale together, and more data is not automatically better. For example, collecting a lower volume of higher-quality structured data, delivered automatically to investing models, algorithms, and analysts, will have a much higher Return on Investment (ROI) than a high volume of unstructured data collected from disparate sources and formats.
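
A back-of-the-envelope illustration of that comparison; every figure below is invented purely to show the arithmetic, not taken from the survey:

```python
# ROI = (value generated - cost) / cost; all figures are hypothetical.
def roi(value_generated: float, cost: float) -> float:
    return (value_generated - cost) / cost

# Scenario A: low volume of high-quality structured data, delivered
# automatically. Scenario B: high volume of unstructured data from
# disparate sources/formats (heavy cleaning and integration costs).
roi_a = roi(value_generated=500_000, cost=100_000)
roi_b = roi(value_generated=600_000, cost=400_000)
print(f"structured, low volume:    ROI = {roi_a:.0%}")  # 400%
print(f"unstructured, high volume: ROI = {roi_b:.0%}")  # 50%
```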

Financial companies and fintechs therefore need a clear data collection and governance strategy, one that incorporates data-sourcing automation to enable better asset management and portfolio structuring.

* This data is based upon those respondents, among the professionals surveyed, whose organizations use alternative data.