Big-Time Data Terminology

The term "big data" is so broad that when commenting on data management and the industry, it's better to consider specific topics such as data quality, data consistency or deriving value from data – or at least to discuss matters in those terms.
A presentation given this past week by Pierre Feligioni, head of real-time data strategy at S&P Capital IQ, defined "big data" as "actionable data," and sought to portray big data concerns as really being about four issues: integration, technology, content and scalability.
Integration, particularly the centralization of reference data, is the biggest challenge for managing big data, as Feligioni sees it. While structured data is already quite "normalized," unstructured data, which can include messaging, emails, blogs and Twitter feeds, needs to be normalized.
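Normalization here means mapping heterogeneous unstructured records onto one common schema so they can sit alongside structured reference data. A minimal sketch of the idea in Python follows; the sources, field names and target schema are illustrative assumptions, not S&P Capital IQ's actual data model.

```python
def normalize_record(raw: dict, source: str) -> dict:
    """Map a raw message from one source onto a common schema.

    The field names and schema below are hypothetical, chosen only
    to illustrate normalization of unstructured feeds.
    """
    if source == "twitter":
        return {
            "source": "twitter",
            "author": raw.get("user", {}).get("screen_name", "unknown"),
            "text": raw.get("text", "").strip(),
            "timestamp": raw.get("created_at"),
        }
    if source == "email":
        return {
            "source": "email",
            "author": raw.get("from", "unknown"),
            "text": (raw.get("subject", "") + ": " + raw.get("body", "")).strip(),
            "timestamp": raw.get("date"),
        }
    raise ValueError(f"unrecognized source: {source}")

# Two very different raw shapes collapse into one queryable form:
tweet = {"user": {"screen_name": "trader1"},
         "text": " Earnings beat estimates ",
         "created_at": "2012-06-01T14:00:00Z"}
print(normalize_record(tweet, "twitter")["text"])
```

Once every feed lands in the same shape, a central repository can coordinate it with structured data, which is the point Feligioni makes.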
Unstructured data is fueling rapid growth in data volumes, justifying the name "big data." Data volumes are now counted in terabytes (1,000 gigabytes), or even petabytes (1,000 terabytes). When it comes to unstructured data at those levels, central repositories that can collect and normalize data – and coordinate it with structured data – are a must, Feligioni contends.
Technology and scalability are the building blocks necessary to make such central repositories functional, as he describes it. Natural language processing and semantic data approaches are also being applied. "The biggest challenge is understanding documents and creating analytics on top of this content, for the capability to make a decision to buy or sell," says Feligioni.
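To make "analytics on top of content" concrete, here is a deliberately crude sketch: a bag-of-words score over a document that maps to a buy, sell or hold signal. The word lists and thresholds are invented for illustration; production systems of the kind Feligioni describes use full NLP and semantic models, not keyword counts.

```python
# Hypothetical sentiment lexicons, for illustration only.
POSITIVE = {"beat", "growth", "upgrade", "record"}
NEGATIVE = {"miss", "downgrade", "loss", "lawsuit"}

def sentiment_score(document: str) -> int:
    """Count positive minus negative keyword hits in a document."""
    words = document.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

def signal(document: str) -> str:
    """Turn a raw score into a toy trading signal."""
    score = sentiment_score(document)
    if score > 0:
        return "buy"
    if score < 0:
        return "sell"
    return "hold"

print(signal("Earnings beat estimates and growth continued"))  # prints "buy"
```

Even this toy version shows the pipeline: unstructured text in, a normalized numeric score out, and a decision rule on top.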
Scalability makes it possible to process more and more information, and is achieved through new resources, such as cloud computing, which carry their own issues and require additional decisions [as described in my column two weeks ago, "Cloud Choices"].
Everything that Feligioni calls part of "big data" actually revolves around getting higher quality data by incorporating more sources and checking them against each other to keep that data consistent. It's also about creating new value from data that can be acted upon by trading and investment operations professionals.
So, whatever buzzwords one uses, whether "big data" or sub-categories under that umbrella, what they are really talking about is quality, consistency and value. Other terms just describe the means.
Copyright Infopro Digital Limited. All rights reserved.