JP Morgan Data Scientist Eyes Insights from Internal Data

- By Faye Kilburn
- @Fayekilburn
- 05 Jul 2018

The firm’s three flagship real estate funds, which buy individual properties rather than portfolios, manage roughly $50 billion in assets globally. Over the years, the funds have amassed research on the US real estate market that could now bear fruit for the rest of the business, says Ravit Mandell, chief data scientist for JPMAM’s intelligent digital solutions division. Mandell joined the asset manager last year to lead its Big Data and deep learning efforts. She came from JP Morgan’s corporate and investment bank, where she was head of the quantitative market making and swap trading desk, responsible for macro market strategy, the strategic investments portfolio, Big Data analytics, and market structure initiatives.

“Whenever [the funds] buy a mall, we collect data on the per-square-foot purchase cost, the foot traffic, when was the last time they changed the air conditioning or painted the parking lot, so we have a lot of internal data in the US,” she says.

As well as providing this data to its own investment team to steer the analysts on the real estate sector and the health of the broader US economy, the firm may give it to institutional clients to inform their broader real estate allocations. Using data mining techniques, the firm aims to find previously unexploited uses for data held within the group.

Knowing Where to Look

The asset manager initially held alignment sessions to determine what intellectual property exists in the firm. “We have hired data strategists and engineers to find data internally. Someone, somewhere always has it,” Mandell says.

The hunt for internal data is part of a broader effort to build a data science culture at JPMAM, where fundamental analysts and traders incorporate machine learning and non-traditional forms of data in their daily decision making.

The firm’s data efforts are housed in its so-called “data lab,” currently staffed by 21 data scientists and engineers—most of whom have been hired in the past three months—though Mandell expects headcount to double over the next 18 months.

The group plans to apply machine learning analysis to proprietary information and internal data from its investment bank parent, and is looking into applying sentiment analysis to more than 100,000 company research reports written by JPMAM analysts, who are required to write reports on any company they meet. Mandell says this could provide proprietary insight for JPMAM’s investment process.

“The investment bank has been around for 150 years, and has collected so much data. As a bank-owned asset manager, we have a lot of internal data. But when it comes to data mining, the question is how to make all of that data useful,” she adds.

War of the Words

To date, the firm has been using so-called “bag of words” language processing to analyze text in earnings call transcripts, primarily to predict revisions in earnings estimates for its emerging markets and Asia-Pacific equities, behavioral finance equities, and beta strategies teams. This method assigns sentiment scores to individual words but, due to its “binary” nature, cannot handle some complexities of human language. For instance, it fails to recognize the neutral sentiment of a phrase like “it’s not the worst,” Mandell says.
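To make the limitation concrete, here is a minimal Python sketch of word-level scoring. The lexicon, its scores, and the function names are hypothetical and are not taken from JPMAM’s models; the point is only that summing per-word scores ignores negation and word order.

```python
import re
from collections import Counter

# Hypothetical word-level sentiment lexicon (illustrative values only).
LEXICON = {
    "worst": -1.0,
    "bad": -1.0,
    "weak": -0.5,
    "strong": 0.5,
    "good": 1.0,
    "best": 1.0,
}


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def bag_of_words_score(text):
    """Sum per-word scores; word order and negation play no role."""
    counts = Counter(tokenize(text))
    return sum(LEXICON.get(word, 0.0) * n for word, n in counts.items())


phrase = "it's not the worst"
score = bag_of_words_score(phrase)
print(phrase, "->", score)
```

Under these assumed scores, the phrase comes out negative because only “worst” appears in the lexicon, and shuffling the words would not change the result.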

So, the asset manager is rolling out a new natural language processing model based on recurrent neural networks, the type of model used by Google and Apple’s Siri. The technology can memorize inputs and make predictions (as opposed to “feedforward” neural networks, which are primarily used for pattern recognition) and will be able to understand the sentiment of phrases in analyst reports and other documents in the context of the fund manager’s business. To train the neural network, the firm has built JP Morgan-specific dictionaries, which feed the language processing models. For example, the new models will be able to assign a negative sentiment to a term like “higher deficits,” whereas previous models would incorrectly read the term as both good (“higher”) and bad (“deficits”).
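The dictionary idea can be illustrated without a neural network. The Python sketch below checks a hypothetical phrase dictionary before falling back to word-level scores. It is not the recurrent model the firm describes, and all entries and function names are assumptions made for illustration.

```python
# Hypothetical word-level scores, as a word-by-word model might hold them.
WORD_SCORES = {"higher": 1.0, "deficits": -1.0, "growth": 1.0}

# Hypothetical domain dictionary: multi-word terms scored as a unit.
PHRASE_SCORES = {("higher", "deficits"): -1.0}


def word_level_hits(tokens):
    """Score each known word on its own, ignoring its neighbors."""
    return [(t, WORD_SCORES[t]) for t in tokens if t in WORD_SCORES]


def phrase_aware_hits(tokens, max_len=3):
    """Prefer the longest matching dictionary phrase, else score single words."""
    hits = []
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate phrase first, down to two words.
        for n in range(min(max_len, len(tokens) - i), 1, -1):
            phrase = tuple(tokens[i:i + n])
            if phrase in PHRASE_SCORES:
                hits.append((" ".join(phrase), PHRASE_SCORES[phrase]))
                i += n
                matched = True
                break
        if not matched:
            if tokens[i] in WORD_SCORES:
                hits.append((tokens[i], WORD_SCORES[tokens[i]]))
            i += 1
    return hits


tokens = "higher deficits".split()
print(word_level_hits(tokens))    # two conflicting word-level signals
print(phrase_aware_hits(tokens))  # one phrase-level signal
```

With these assumed entries, the word-level pass reports one positive and one negative hit for “higher deficits,” while the phrase-aware pass reports a single negative hit for the whole term.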

Mandell says the new model will allow the firm to extract information from research notes, broker reports and filings to help anticipate company earnings revisions; gain insight from corporate filings (e.g., 10-Ks, 10-Qs and prospectuses) to enhance its research efforts; and use temporal clustering and sentiment analysis on news to better understand what’s being discussed and where consensus is forming.

For five years, JP Morgan operated a centralized data science unit that oversaw data architecture and strategy for the entire group. However, the group is now decentralizing that function to integrate data science capabilities into its four business lines: central investment banking, corporate client banking, asset and wealth management, and technology. “The math behind it has existed for many years but now we have the compute power to figure things out that we couldn’t two years ago,” Mandell says.

