Fool’s Gold: Data Mining Digs Up Explosive Errors
Recent studies reveal the prevalence of poor-quality data, exacerbated by increased use of machine learning that allows users to dredge far bigger datasets and identify spurious correlations.
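A minimal sketch of how dredging produces spurious correlations, assuming purely random data: the function names, sample sizes and seed below are illustrative and do not come from any of the studies cited. Every candidate series is independent noise, so any correlation with the target is spurious by construction, yet the strongest one found grows as more candidates are searched.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def best_spurious_correlation(n_obs, n_features, seed=0):
    """Search n_features random series for the one most correlated
    with a random target. All series are independent noise."""
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_obs)]
    best = 0.0
    for _ in range(n_features):
        feature = [rng.gauss(0, 1) for _ in range(n_obs)]
        best = max(best, abs(pearson(feature, target)))
    return best


# Same seed, so the larger search includes the smaller one as a prefix.
few = best_spurious_correlation(n_obs=50, n_features=10)
many = best_spurious_correlation(n_obs=50, n_features=2000)
print(f"strongest |r| among 10 random series:   {few:.2f}")
print(f"strongest |r| among 2000 random series: {many:.2f}")
```

Because both searches share a seed, the wider search can only match or beat the narrower one. Its best result reflects how many candidates were tested, not any real relationship in the data.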

A review of 100 major psychology studies, for instance, found that only 36 percent yielded statistically significant results when replicated. Over half the alien planets identified by Nasa’s Kepler telescope turned out to be stars. And in preclinical cancer research, a mere six out of 53 breakthrough studies were found to be reproducible. Quantitative finance does not fare much better.
“It’s a gigantic problem—spurious results are the norm,” says Zak David, co-founder of analytics firm Mile 59, and former engineer of high
Copyright Infopro Digital Limited. All rights reserved.