Michael Shashoua: Shadow Play
Michael Shashoua looks at dark data and suggests its value is based on 'meaning' residing within large datasets.
The term "dark data" is becoming a buzzword in data management, much as "big data" did five years ago. Like big data before it, dark data has so far raised more questions than answers about what it means.
In a feature in the February issue of Inside Reference Data, Thomson Reuters' resident data management expert Tim Lind said that making connections between existing pieces of data is the definition of dark data. Accessibility of the data is only the first part of the problem, and solving that still leaves firms having to figure out what predictive value the data may have once put into a fuller context.
That is similar to the struggle that firms have experienced with big data. Often when big data has been used in financial services operations discussions—particularly in data management operations discussions—it pertains to what is being done with data, such as how firms mine larger data volumes.
Just as the industry shouldn't get caught up in big data hysteria, it should also be cautious about overreacting to dark data. The most important thing is to know exactly what data you have, how frequently you are getting it, and what you are trying to achieve with it.
"Dark data could be a more appropriate term than big data," says Norbert Boon, executive director of Flytxt, a consultancy in Amsterdam that works on big data analytics issues. "Big data is the structured and unstructured data one stores, for which there is no immediate purpose," he says. "Dark data is much more appropriate—it's the value hidden inside. Finding that is the challenge."
Customer Data Grows
Another mark against the relevance of big data is the possibility that the data generated in our industry is not as large in volume as that generated in other industries, such as telecoms. Increased activity around know-your-customer (KYC) data in the financial sector makes this assertion a question rather than a certainty, however. The global data and messaging services provider Swift recently unveiled its KYC Registry, which launched in December 2014 and is still growing. The registry is already being upgraded with a new profile feature to collect correspondent firm activity data, making its customer information more complete to better support risk and exposure management.
It could be said that this meets the definition of dark data that Lind proposes: connecting pieces of data that already exist, but haven't been appropriately linked to show a more complete picture of what's occurring in the market or in a firm.
Identifiers' Impact
Also on the subject of knowing what's going on within one's firm, the global legal entity identifier (LEI) initiative, which reached 340,000 registrations at the start of February this year, is likely to eventually reach three times that number, according to Bill Hodash, managing director of business development at the Depository Trust & Clearing Corporation (DTCC). The DTCC and Swift operate the Global Markets Entity Identifier utility, which has handled about half of those registrations.
There may, however, be another spur to accelerate the growth of LEIs: more regulatory action. In Europe, Mifid II and the Central Securities Depositories Regulation already began fueling more LEI issuance in 2014, and the US Securities and Exchange Commission's new amendments to Regulation SBSR in early 2015, which include LEI registration requirements for security-based swaps, are likely to raise registrations yet further.
This suggests that the dark data or big data stories are far from finished. If the number of LEIs to be managed goes far beyond the current expectation of about 1.5 million, that could create higher data volumes and more pieces of information that end up in different silos within firms.
Copyright Infopro Digital Limited. All rights reserved.