Michael Shashoua: Shadow Play
Michael Shashoua looks at dark data and suggests its value is based on 'meaning' residing within large datasets.
The term "dark data" is becoming a buzzword in data management, much as big data did five years ago. Like big data before it, the term raises more questions than answers, at least so far, about what it means.
In a feature in the February issue of Inside Reference Data, Thomson Reuters' resident data management expert Tim Lind said that making connections between existing pieces of data is the definition of dark data. Accessibility of the data is only the first part of the problem, and solving that still leaves firms having to figure out what predictive value the data may have once put into a fuller context.
That is similar to the struggle that firms have experienced with big data. Often when big data has been used in financial services operations discussions—particularly in data management operations discussions—it pertains to what is being done with data, such as how firms mine larger data volumes.
Just as the industry shouldn't get caught up in big data hysteria, it should also be cautious about overreacting to dark data. The most important thing is to know exactly what data you have, how frequently you are getting it, and what you are trying to achieve with it.
"Dark data could be a more appropriate term than big data," says Norbert Boon, executive director of Flytxt, a consultancy in Amsterdam that works on big data analytics issues. "Big data is the structured and unstructured data one stores, for which there is no immediate purpose," he says. "Dark data is much more appropriate—it's the value hidden inside. Finding that is the challenge."
Customer Data Grows
Another mark against the relevance of big data is the possibility that data generated in our industry is not as large in volume as that generated in other industries, such as telecoms. Increased activity around know-your-customer (KYC) data in the financial sector makes this assertion a question rather than a certainty, however. The global data and messaging services provider Swift recently unveiled its KYC Registry, launched in December 2014, which is still growing. The registry is already being upgraded with a new profile feature to collect correspondent firm activity data, making its customer information more complete to better support risk and exposure management.
It could be said that this meets the definition of dark data that Lind proposes: connecting pieces of data that already exist, but haven't been appropriately linked to show a more complete picture of what's occurring in the market or in a firm.
Identifiers' Impact
Also on the subject of knowing what's going on within one's firm, the global legal entity identifier (LEI) initiative, which reached 340,000 registrations at the start of February this year, is likely to eventually reach three times that number, according to Bill Hodash, managing director of business development at the Depository Trust & Clearing Corporation (DTCC). The DTCC and Swift operate the Global Markets Entity Identifier utility, which has handled about half of those registrations.
There may, however, be another spur to accelerate the growth of LEIs: more regulatory action. In Europe, Mifid II and the Central Securities Depositories Regulation already began fueling more LEI issuance in 2014, and the US Securities and Exchange Commission's new amendments to Regulation SBSR in early 2015, which include LEI registration requirements for security-based swaps, are likely to raise registrations yet further.
This suggests that the dark data or big data stories are far from finished. If the number of LEIs to be managed goes far beyond the current expectation of about 1.5 million, that could create higher data volumes and more pieces of information that end up in different silos within firms.
Copyright Infopro Digital Limited. All rights reserved.