Heavy lifting: Why using AI for data extraction is still no easy task
Using AI to extract data from documents and filings should be a no-brainer. But it takes a lot of brains and money to get those processes set up and running reliably and accurately.

Data extraction and AI experts are warning that attempting to use generative AI models to parse data from public company documents such as SEC filings could prove costly and—more importantly—may not deliver accurate results. Specifically, they say financial firms that need very high standards of accuracy should not try to develop solutions using generic AI tools and large language models (LLMs) but will need to invest time and money to build highly custom services.
Patronus AI, which provides a
Copyright Infopro Digital Limited. All rights reserved.