At this year's Waters USA event, Dun & Bradstreet's chief data scientist, Anthony Scriffignano, implored the audience to retool the way they think about data, or risk drowning in a sea of information.
In Lewis Carroll's classic novel "Through the Looking-Glass," Alice finds herself running as hard as she can only to stay in the same spot. Alice explains to the Red Queen that in her country, if you "run very fast for a long time" then "you'd generally get to somewhere else."
The Red Queen scoffed, "A slow sort of country! Now, here, you see, it takes all the running you can do, to keep in the same place. If you want to get somewhere else, you must run at least twice as fast as that!"
Carroll was, among many other things, a logician, and in computer science the parable above has come to be known as a Red Queen problem. Anthony Scriffignano believes that the finance sector is currently living in one. Firms are drowning in a sea of data and pushing as hard as they can to stay above the deluge, throwing money and manpower at the problem, but in the end all they're doing is staying in the same place.
"We are living in a Red Queen problem right now, and the only way out of a Red Queen problem is not to run faster or turn the crank harder, but do something orthogonal to what you've been doing all along—something completely different. That's what we have to do in this environment," said Scriffignano, who gave a presentation at this year's Waters USA conference.
The Black Plague of Information
Scriffignano, chief data scientist at information consultancy Dun & Bradstreet, estimates that about 85 percent of the data that's being created is unstructured. As datasets get bigger and bigger, users are awash in information and tend to fall back into a role where, in order to maintain a semblance of sanity, they focus in on their ontology—their business space—and shut out the rest.
The problem is that they're taking data at face value and not establishing relationships within it. As a result, the value is generally lost. Additionally, as new data sources pop up, users can try to vet those streams of information by testing a stratified, representative sample of that data, but they end up losing provenance and the reasoning behind how they got to where they now are with that information. And by the time they're done testing, 10 new sources have already popped up.
"If we're not paying attention, it's very easy to get washed over by this; I can argue that as a human race, to some extent, that's happening right now," Scriffignano said. "If you look at the human race and how we have dealt with hypergeometric growth of information, you can't find any examples because it hasn't happened. But we do have some examples of dealing with hypergeometric growth, such as the Black Plague—our response has been primarily to die off until we figure it out. That's probably not a good response."
A Sea of Potential
Scriffignano said that eventually, analytics platforms will be able to prepare for unplanned events like natural disasters or political unrest, but we're not there yet. Even so, he provided some hypothetical examples of how firms can gain value from largely unstructured data. He also talked about the potential dangers of bad actors turning technology against us.
Take, for example, the maligned Hello Barbie doll. Barbie is an iconic toy, but this iteration is connected to the internet. The Mattel-made doll is designed to help children with their verbal skills by letting them have actual conversations with it: the child speaks into the doll's microphone, and the doll runs those words through a server and returns an appropriate response in real time. But security firm Bluebox Labs has found major vulnerabilities in the device.
"It's kind of a cool idea until you think about the fact that you have these kids running around with a device, connected to wifi, connected to the internet, with a microphone on it, that they may or may not leave in the home office of their parents, that may or may not be turned on—at any time—to record anything. Didn't see that one coming," he said, adding one more warning: "And don't talk in front of your television, by the way."
Evolve or Die
His conclusion was that today's environment requires new skills and new ways of thinking, because you can't escape the fact that the bad guys are innovating faster than the good guys. There's also a new reality that truth is fungible, as structured data shows only about 15 percent of the picture. And most striking, computable data, which is data that can be consumed by algorithms, doesn't even remotely resemble what it looked like just two years ago, Scriffignano said.
The inconvenient truth, he concluded, is that more data is not necessarily better data. We have to think about the data we have differently than we did just a decade ago.
"This is the way the new stuff is going to look," he said. "Data you collected yesterday is not one day old, it's data you collected yesterday, but we behave as though it's one day old."