Experts Warn on Data Quality for AI Projects

Speakers at the Buy-Side Technology North American Summit said a single source of quality data is where AI projects should start.

Data governance strategies that rationalize data, prioritize lineage, and put a premium on quality are necessary before moving into artificial intelligence (AI) and machine learning, experts say.

Speakers at the Buy-Side Technology North American Summit, held in New York on October 8, said that data remains the important piece people overlook before undertaking AI projects.

For Julia Bardmesser, senior vice president and head of data at Voya Investment Management, AI projects are useless without an understanding of the data fed to them.

“Before you use AI, you need to get your data together. It’s cool, but you need to have the basic understanding of the data you have, know what data is good and what it looks like,” Bardmesser said. “You can run your machine learning on pretty much any data you can source, but how much are you going to trust the results if you don’t know where it came from?”

Many of the panel speakers are already working on AI and machine-learning programs, and have encountered the pitfalls of poor-quality data and lax data management. As a result, they said, they have undertaken projects to address both.

“What I’ve heard is that if you have a lot of data, quality is not important, but that’s just not true,” Bardmesser added. Indeed, a large volume of data can complicate the issue further, according to Phillip Dundas, head of technology and change at Schroders. Quality is paramount, he explained.

“We’re going through the process of rationalizing the number of data masters or data sources we have, because we’ve seen that exact problem,” Dundas said. “People wanted their data close to them, but what that ends up producing [is] inconsistent results. That’s the last thing you want when dealing with clients, presenting one dataset that looks like this and another that looks different.”

He added that the company has moved to a single master list for client or investment data across the entire firm.

All speakers noted that AI and machine learning have great benefits for their firms, even if there are data issues to contend with.

“Machine learning is the killer app that will ultimately take us on the operational side towards that final nirvana of eliminating the last mile of manual processing,” said Michael McGovern, enterprise chief information officer for Brown Brothers Harriman. “The tools are great, but it’s the expertise that is important, so we all need to become citizen data scientists.”

Voya’s Bardmesser noted that just putting bots to work automating many operations processes has helped make operations more attractive to employees. She said talent retention increased because the most manual and tedious processes were eliminated.

Bardmesser and the other panelists, however, warned that data is not the only thing that matters for AI and automation. Processes also need to be optimized first, so that firms do not automate something that is inefficient in the first place.
