GenAI and data quality converge at T. Rowe Price’s (Data) Lake Como
Jay Como was hired by T. Rowe Price in 2023, just as generative AI was rippling across the capital markets. Rather than letting data quality remain a hindrance to AI development, his team wants to use AI and agents to help solve this long-standing issue.
The year was 1992, and Jay Como and three friends set out to leave their hometown in Maine and cross the United States in two cars.
After arriving in Arizona, the foursome crammed into a two-bedroom apartment and went looking for work, needing money for the essentials: rent, food, gas, and beer. As luck would have it, the woman behind the desk at the temp agency had four jobs that needed filling. Two were in a warehouse, and two were at Security Pacific Bank.
With the four men sharing two cars, they would need to take the jobs in pairs. The owner of one of the cars (not Como) raised his hand and said, “I’ll take the warehouse.” Another friend (not a car owner; also not Como) said, “Yeah, I’m with him.”
“I guess that leaves the bank job,” said Como, the other car owner, who looked at Kevin, his new coworker by default.
“What’s the bank job?” they asked.
“You’ll be working in the bank’s vault, but they’ll give you more details when you show up tomorrow,” the woman said.
Call it the snowball effect or the butterfly effect, but seemingly small things often become bigger over time. In this case, it was a matter of hands—when they were raised, who raised them—that made Como a banker and, eventually, an authority on financial data. More than 30 years later, Como and his friend are still working in finance, though they’ve come a long way from the depths of a vault.
“I wore a lot of hats throughout most of my career,” he says today. “But strangely, toward the second half of my career, I’ve really settled into what I’ll probably do for the next 10 years, maybe longer if they’ll have me. Data is something that I feel like I’m pretty good at and I’m really passionate about—it really excites me.”
‘Rinse & repeat’
By the time Como arrived at T. Rowe Price in 2023, he had gone from being a temp at Security Pacific Bank to becoming something of a regulatory data specialist at Bank of America, which acquired SPB the year Como arrived. Other stops along the way were JP Morgan, Barclays, the Federal Reserve Bank of San Francisco, E*Trade Financial, and Charles Schwab. Operations roles. Data roles. Eventually, chief data officer roles.
In 2020, he joined the now-infamous Silicon Valley Bank as a CDO. For Como, it was a painful and surprising experience, to be sure, but he walked away feeling more confident as a data professional. And he landed on his feet at one of the largest asset managers in the world.
When Como arrived at T. Rowe, a few important transformations were already in motion. The firm had drawn a hard-line connection between the chief data office and technology (as opposed to a dotted line), and the market data services unit was also connected as a hard line. As was engineering. The firm was also laying the foundation for a data governance program, creating data products, and building out its analytics capabilities.
I envision a low-code solution using agentic AI to remediate data issues at source—in effect, repairing the data in the same fashion as the models create metadata
Jay Como, T. Rowe Price
The asset manager had made a strategic decision to invest in what Como—T. Rowe’s global head of data governance and market data—calls “a bona fide chief data office [with] industrial-grade data products, data governance, and data quality.” Como’s job was to look at T. Rowe’s data management practices and operating model, and then to launch “foundational data products, federated data governance and operations, and centrally provided standards, policies, and tooling.”
Definition 1: Data product [noun]: A data product, Como says, should have governance in the form of lineage, metadata, data quality, and data owners with business-capability features such as an owner, formal intake, backlog, roadmap, and compute logic driving business value.
Definition 2: Tooling [noun]: This is your data quality platform, he says. Your master data management platform. And embedded in this ecosystem is a lineage tool and an issues management tool. And, of course, there’s a robust data catalog.
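These definitions can be made concrete. Purely as an illustration, and assuming nothing about T. Rowe’s actual schema, a data product descriptor capturing both the governance features and the business-capability features Como lists might be sketched like this:

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical sketch of a "data product" descriptor along the lines Como
# describes: governance (lineage, metadata, quality rules, owners) plus
# business-capability features (formal intake, backlog, roadmap).
# Field names are illustrative assumptions, not T. Rowe Price's schema.

@dataclass
class DataQualityRule:
    rule_id: str
    description: str              # e.g. "LEI must be present and 20 characters"
    severity: str = "high"

@dataclass
class DataProduct:
    name: str
    owner: str                                         # accountable data owner
    lineage: List[str] = field(default_factory=list)   # upstream sources
    metadata: Dict[str, str] = field(default_factory=dict)  # the "30 points of metadata"
    quality_rules: List[DataQualityRule] = field(default_factory=list)
    intake_queue: List[str] = field(default_factory=list)   # formal intake requests
    backlog: List[str] = field(default_factory=list)
    roadmap: List[str] = field(default_factory=list)
```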
Como developed a federated target operating model and began appointing divisional data officers (DDOs). The data office educated these DDOs to “operationalize” their unit’s data.
“Over time,” Como says, “we stood up over 1,000 net new data quality rules and a significant population of key data elements. Every key data element now has data owners and 30 points of metadata.”
The team also created “data quality ratings,” which are essentially scorecards that let users see how confident and complete a dataset is. After the system was shown to the board of directors in 2024, it was approved for use in data remediation.
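Como does not disclose how those ratings are calculated. As a hedged illustration only, a scorecard that blends completeness of key fields with the pass rate of a handful of data quality rules could look like the following sketch; the weights, thresholds, and the sample rule are assumptions.

```python
import pandas as pd

# Hypothetical scorecard: the article does not disclose T. Rowe's formula, so this
# sketch simply blends key-field completeness with the pass rate of a few
# illustrative data quality rules, then maps the score to a traffic-light rating.

def completeness(df: pd.DataFrame, key_columns: list) -> float:
    """Share of key fields that are populated."""
    return float(df[key_columns].notna().mean().mean())

def rule_pass_rate(df: pd.DataFrame, rules: dict) -> float:
    """rules maps a rule name to a boolean check over the frame."""
    return float(sum(check(df).mean() for check in rules.values()) / len(rules))

def quality_rating(df: pd.DataFrame, key_columns: list, rules: dict) -> str:
    score = 0.5 * completeness(df, key_columns) + 0.5 * rule_pass_rate(df, rules)
    return "green" if score >= 0.95 else "amber" if score >= 0.80 else "red"

# Example: an instrument dataset with one missing ISIN and one missing LEI
instruments = pd.DataFrame({
    "isin": ["US0378331005", "US5949181045", None],
    "lei":  ["HWUPKR0MPOU8FGXBT394", None, "INR2EJN1ERAN0W5ZP974"],
})
rules = {"lei_is_20_chars": lambda df: df["lei"].str.len().eq(20).fillna(False)}
print(quality_rating(instruments, ["isin", "lei"], rules))  # -> "red"
```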
Now in place, “it’s kind of rinse-and-repeat at this point,” Como says.
Meta-morphosis
In 2023, T. Rowe was worried about governing the budding field of generative AI. It was not alone in its worry. Many banks enacted bans against the seemingly out-of-nowhere technology. By the time another year passed, GenAI was a selling point for those same banks.
In the highly regulated capital markets, trading houses had to create policies, standards, and controls to govern the use of GenAI; it couldn’t be a free-for-all. Como and his team had to protect against copyright infringement. GenAI hallucinates, and they had to answer questions like: What’s an acceptable tolerance level, and for which functions? And now that T. Rowe had a data quality rating system in place, how would they assess the quality of something generated by AI?
These are easy questions to understand, but less easy to answer, Como says, which may be why CDO roles have been perilous on Wall Street: “You’re the CDO! Why aren’t we getting immediate, magnificent results using GenAI?!”
In many cases, the culprit is metadata.
Metadata has often been overlooked, especially by non-data professionals. But Como is “obsessed” with it.
Today, a T. Rowe user can drop their data into a large language model that applies the firm’s standards and ETL (extract, transform, load) scripts, which will structure the metadata to about 80%, and a human will finish the rest.
“What took a year and a lot of additional resources now takes a week or two, depending on how long it takes to provision the data,” Como says. “Now, the hard part isn’t structuring the metadata; the hard part is getting access to the data and following security and privacy protocols. Despite this metadata game-changer, I suspect firms will still have a bit of a challenge finding data.”
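The article doesn’t name the model or the prompt format, so the following is only a minimal sketch of that workflow under stated assumptions: `call_llm` stands in for whatever model endpoint a firm uses, and the metadata field names are illustrative rather than T. Rowe’s 30-point standard.

```python
import json

# Sketch of the workflow described above: a column profile goes to an LLM that
# applies the firm's metadata standard, the model drafts most of the metadata,
# and a human data owner completes and approves the rest. `call_llm` stands in
# for whatever model endpoint a firm uses; the field names are assumptions.

METADATA_FIELDS = ["business_definition", "data_owner", "sensitivity",
                   "source_system", "valid_values"]   # a subset of the "30 points"

def draft_metadata(column_name: str, sample_values: list, call_llm) -> dict:
    prompt = (
        "Following the firm's metadata standard, draft values for the fields "
        f"{METADATA_FIELDS} for the column '{column_name}', "
        f"given sample values {sample_values}. Return JSON only."
    )
    draft = json.loads(call_llm(prompt))
    # Anything the model could not infer is left blank for the human reviewer.
    return {f: draft.get(f, "") for f in METADATA_FIELDS}

def human_review(draft: dict, reviewer_edits: dict) -> dict:
    """The data owner fills the remaining gaps and overrides anything wrong."""
    return {**draft, **reviewer_edits}
```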
Last year, Como spoke at a conference held by Gartner for the New York chapter of chief data and analytics officers. He had prepared notes that, he thought, were up to date.
He said his firm would take what it had learned about using LLMs for data structuring, and the next iteration would be LLMs learning patterns, constraints, and relationships through training and inference to create data quality checks. Como thought his team was mere months—if not weeks—away from achieving this.
He was wrong, delightfully so. He returned from New York to find that the team had already achieved what he had said the firm would eventually do.
Technological progress in the capital markets, at least lately, is fast. Once you understand how your data behaves and data quality rules can run with very little human intervention, the next frontier is agentic AI, which can handle remediation autonomously.
“I envision a low-code solution using agentic AI to remediate data issues at source—in effect, repairing the data in the same fashion as the models create metadata,” Como adds. “This low-code source data remediation should occur this year.”
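What “remediation at source” might look like in code isn’t described in the article; the sketch below is a hypothetical agent loop in that spirit, with validation and sign-off gates before anything is written back to the system of record. Every helper name is an assumption.

```python
# Hypothetical agentic remediation loop: a data quality rule fails, an agent
# proposes a fix, and the change only reaches the system of record after the
# rule re-passes and, where required, a human signs off. All helpers are
# placeholders marking the seams such a workflow would need.

def remediate_at_source(defect, propose_fix, validate, apply_to_source, needs_signoff):
    """defect: a failed rule plus the offending record(s)."""
    proposal = propose_fix(defect)               # an agent/LLM drafts the correction
    if not validate(defect, proposal):           # re-run the rule against the proposal
        return "escalate_to_human"
    if needs_signoff(defect):                    # sensitive fields require approval
        return "queued_for_approval"
    apply_to_source(defect.record_id, proposal)  # write back to the system of record
    return "remediated"
```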
In the old days—you know, about four years ago—data collection was highly manual. A human spent their days digging through Microsoft Excel files or whatever bootleg database was slapped together to find the data needed for regulatory reports, risk management, and trading. Often, consultants were brought in to help because you had to throw bodies at the problem.
A human would then take that data, put it into their catalog, and likely recertify it using something like Collibra. There was usually a data quality platform, but it was independent of anything else. Everything was fragmented.
Things are changing. Now you can build a gateway into your data catalog, Como says, and you can link your data catalog to the model’s metadata, and that data can be easily sourced.
“Now there’s a triumvirate for your catalog to use AI to go out and triage any deltas. That in-sync, continuous AI-powered monitoring did not exist before GenAI.”
This means that data quality and remediation processes are becoming increasingly automated. But, Como warns, “there are still big things to solve.”
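The “triage any deltas” idea can be illustrated with a small, hypothetical diff between a catalog entry and model-generated metadata; the field names and routing logic below are assumptions, not a description of T. Rowe’s tooling.

```python
# Hypothetical delta triage between a catalog entry and model-generated metadata:
# missing catalog values can be auto-filled, conflicting values go to a human.
# Field names and routing rules are illustrative assumptions.

def triage_deltas(catalog_entry: dict, model_metadata: dict) -> list:
    deltas = []
    for field in set(catalog_entry) | set(model_metadata):
        recorded, observed = catalog_entry.get(field), model_metadata.get(field)
        if recorded != observed:
            deltas.append({
                "field": field,
                "catalog": recorded,
                "model": observed,
                "action": "auto_fill" if recorded in (None, "") else "human_review",
            })
    return deltas

# Example: the model infers a sensitivity tag the catalog never had
print(triage_deltas({"sensitivity": ""}, {"sensitivity": "confidential"}))
```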
Hub-and-spoke
As one might expect from any highbrow buy-side shop, T. Rowe’s quants, research directors, and portfolio managers are all, to use Como’s words, “incredibly sophisticated.” But that ability to “go it alone” can also create problems.
They knew how to use the data and where it needed to be repaired, so they’d fix it. If an instrument was missing a legal entity identifier (LEI), potentially leading to a bad trade, it would be fixed—sometimes many times over. If multiple teams were looking at the same piece of data, each one would probably go to GLEIF, find the appropriate LEI, and then enter it into their trade system.
“There became an internal cottage industry of data operations teams that were fixing the data breaks, but it was decentralized.”
They needed to instead fix the data centrally so that changes would apply to teams across all asset classes and regions. Como and his team made an appeal to the other divisions: We know that you can fix it on your own, but let us fix it centrally so it makes all our lives easier. Win-win.
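GLEIF publishes a public REST API for LEI records, so a centralized fix along the lines Como describes could, in a simplified sketch, look something like the following. The filter syntax and the write-back hook are best-effort assumptions rather than T. Rowe’s actual implementation.

```python
import requests

# Centralizing the LEI fix-up: one service queries GLEIF's public LEI API and
# writes the result back for every downstream consumer, instead of each desk
# repeating the lookup. The filter syntax and write-back hook are assumptions.

GLEIF_URL = "https://api.gleif.org/api/v1/lei-records"

def lookup_lei(legal_name: str):
    resp = requests.get(
        GLEIF_URL,
        params={"filter[entity.legalName]": legal_name, "page[size]": 1},
        timeout=10,
    )
    resp.raise_for_status()
    records = resp.json().get("data", [])
    return records[0]["attributes"]["lei"] if records else None

def enrich_centrally(instruments: list, write_back) -> None:
    """Fill missing LEIs once, then push the fix to all consuming systems."""
    for instrument in instruments:
        if not instrument.get("lei"):
            lei = lookup_lei(instrument["issuer_name"])
            if lei:
                instrument["lei"] = lei
                write_back(instrument)   # single fix, applied everywhere
```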
Building more than 1,000 data quality rules against key data elements is a really thorny process
Jay Como, T. Rowe Price
“To be frank, we’re still in the early days of this journey,” Como says. “Building more than 1,000 data quality rules against key data elements is a really thorny process. We fixed a bunch of data, but as you normally see in the transition from greenfield to a bona fide data operation, a lot of the harmonization is really about fixing rules. You’re getting high volumes of data defects, and you’re going, ‘Oh, this isn’t really a defect, we have to tweak the rule.’”
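The “tweak the rule” loop Como describes can be shown with a toy example: a first-cut rule fires on data that is actually fine, so the constraint is narrowed rather than the data changed. The instruments and the rule below are hypothetical.

```python
import pandas as pd

# Hypothetical example of rule tuning: a first-cut rule flags every instrument
# without a maturity date, but perpetual bonds and equities legitimately have
# none, so the rule, not the data, gets tweaked.

instruments = pd.DataFrame({
    "asset_class":   ["corp_bond", "perpetual_bond", "equity"],
    "maturity_date": ["2030-06-15", None, None],
})

# v1: too strict, flags two false positives
defects_v1 = instruments[instruments["maturity_date"].isna()]

# v2: the rule only applies where a maturity date is actually expected
requires_maturity = instruments["asset_class"].isin(["corp_bond"])
defects_v2 = instruments[requires_maturity & instruments["maturity_date"].isna()]

print(len(defects_v1), len(defects_v2))  # 2 defects become 0
```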
Como says his team has enriched key data with up-to-date LEIs, product codes, and obligor IDs—the nuts and bolts of financial data. The business had to trust the data folks…which hasn’t exactly been an easy sell throughout the history of Wall Street. To foster that trust, the data folks need to be transparent.
This is where the DDOs—divisional data officers—come into play.
A DDO is the person who cares most about data in any given vertical, whether that’s fixed income, corporate functions, legal, human resources, operations, or something else. For a year, each of these people would meet regularly to discuss the importance of data policies and standards. They learned about data quality rules and issues management. When the data catalog was launched, they explained metadata to people who might not normally care about it.
They formed a taskforce. While they may be majors, colonels, and lieutenants in their respective silos, as a group they are the equivalent of a military’s non-commissioned officers (NCOs), who keep things together on the ground when everything goes awry.
While Como declines to give an exact number, T. Rowe Price has appointed multiple DDOs since last July.
“Now we’re at the point where it’s like, ‘Okay, DDOs, you are now the chief data officer for your business—here’s what you need to do. Everything we’ve been talking about for the last 15 months, you need to start doing in your business. Here is some tooling we can give you, the catalog, here are the policies and standards, here are some data quality tools we can give you, and we’ll help you. But it’s your responsibility now, and now there is a mandate for you to govern your data.’”
Since that mandate was handed down, Como has been most surprised that senior executives are quickly buying in and spreading the gospel. A period of rapid change can help deliver a sermon.
“It was not hard to get them there,” Como says. “They knew GenAI was coming; they knew that they had data problems that were tiresome to fix; and they knew that there was a better way. And I really think they wanted to pre-empt being slow to adopt GenAI, so it made it easy to recruit them and get them engaged.”