Opening Cross: For Semi-Structured Signals, Mind Who You Mine

Here at Inside Market Data, we’re often asked to provide longer, more analytical articles in addition to our regular news. So while I’m pleased to tell readers that we’re going to start doing exactly that—as well as breaking more news online—it’s worth looking at what’s driving this demand.
There’s an assumption that the majority of news—at least, in the corporate space (i.e., financial results, corporate actions, and announcements that might impact a company’s share price)—is commoditized, and that the only differentiators between services are the speed with which a newswire can deliver stories, or the level of depth one can provide compared to another. This isn’t always true—there are plenty of true exclusives out there (and we try to make them a large proportion of our content), but the sheer volume of standard news overwhelms them.
Hence, traders and investors are placing more trust in analysis services. In the simplest sense, analytics comprise the charts and displays that turn raw, structured data into visuals, making it easier to understand price movements, apply studies and spot trends. And now, increasingly, firms are applying analytical tools to unstructured content, to derive signals from news volume and sentiment just as they do from trading volumes and price momentum.
Aside from some basic, “semi-structured” content types, sources say these analytics are more likely to yield “more thoughtful,” medium-term indicators than signals suited to low-latency trading, but that they represent higher-value opportunities over longer-term horizons.
However, these timeframes may narrow as more companies use social media as their primary means of communicating with consumers and investors, and the signal-to-noise ratio increases. The advent of Big Data in the form of Twitter, blogs and other social media, which the markets are attempting to tap in search of market-moving leading indicators, creates a new wave of challenges that technology providers are jostling to address with new offerings. For example, GigaSpaces last week released XAP 9.0, a platform on which firms can build their own big data analysis applications, while Titan Trading Analytics added new graphics (including one that displays extremes of social media sentiment as a contrarian indicator) to its TickAnalyst platform, which uses historical analysis of behavioral research to derive trading indicators. But consistent analysis of internet news and social media can be difficult because of factors such as the lack of publishing standards and formats, among many other issues, says Rich Brown, head of quantitative and event-driven solutions at Thomson Reuters.
Steve Ellenberg—moderating a panel at last week’s North American Financial Information Summit—noted a key problem with basing decisions on “social” sources: “News feeds are structured and have authority. But there’s a very low entry point to some forms of unstructured data and social media,” he said.
That’s not to say there aren’t valid sources; you just have to mind whose stream you mine. And to address Brown’s point, are professional market commentators likely to mix emoticons, profanity and multiple exclamation points, for example? Probably not, but the point here seems to be that this channel can uncover market-moving news whose value people don’t realize, because it is being reported in the personal, retail, social realm, such as when an individual in Abbottabad, Pakistan blogged about hearing helicopters, which turned out to be the raid that killed Osama bin Laden.
In the same way that tools previously only available to professional traders are now open to a retail audience, techniques applied to monitoring institutional activity may benefit from being applied to herd-provoking activity among retail investors. Front-running client orders is illegal, but monitoring chatter to identify where the next batch of orders will come from is just making the most of the information available—which is what analytics are all about.