Are we on the brink of another Netscape moment? Back in 1995, Netscape’s IPO started inflating the dotcom bubble. Before Netscape, for all but a small cadre of programmers, computers were glorified word processors. Sure, some people used them for bookkeeping or to play solitaire, but word processing was the bread and butter for general users.
Then along came Netscape, sparking an Internet gold rush and turning San Francisco, once again, into a get-rich-quick boomtown. The parallels between Big Data and the dotcom era aren’t perfect. As Mark Twain said, “History doesn’t repeat itself, but it does rhyme.”
So, sure, there is no Netscape-equivalent startup in Big Data (although VC money is pouring into Big Data startups). And whereas the birth of the Web felt like we went from nothing to something overnight (even if that wasn’t really true), Big Data already has well-known ancestors, such as data mining, distributed computing, Business Intelligence (BI), etc.
One big parallel is certainly rhyming, if not repeating, however: democratization. The democratization of data will change how business decisions are made, who makes them, and who gains (and loses) power from this shift.
What will it mean to democratize data?
Despite the fact that telcos in the U.S. are exerting monopoly power over Internet access, the Internet was certainly a force that democratized information.
In the browser era, people could now read newspapers from all over the world without leaving the house. Information previously locked in arcane journals became available through a simple Web search. And e-commerce sites started to displace retailers, allowing people to hunt low prices in many places other than Walmart.
Granted, much of this democratization resulted in next to nothing. Amazon is the behemoth that Walmart used to be, while Walmart, when not fending off bad PR, is investing heavily in Big Data to bolster its online presence.
Similarly, I can certainly read The Times of London if I want to, but do I?
In fact, one-third of adults in the U.S. now get their news from one source: Facebook.
What this means is that rather than being inundated with news that may be completely foreign to you and may cause you to rethink your worldview, now most of your news is delivered by sites that you have “liked,” and most of it just reinforces what you already believe.
In fact, as the Web moves further and further down the personalization rabbit hole, the Big Data insights that companies like Facebook and Google have gained mean that we see less and less of the Web each day. Instead, searches, social media streams, and even news sites now serve up things that we’ve expressed an interest in before.
The whole, wide, wooly Web is still out there. You just never see it.
Will this same sort of phenomenon happen with Big Data? Will data democratization just give business decision makers more ammo for what they’ve already decided?
David Chaiken, CTO of Big Data infrastructure startup Altiscale, doesn’t think so. Chaiken worked at Yahoo! when the Big Data platform Hadoop was originally developed.
“When I started at Yahoo!, stacks of data were locked away in many different parts of the company. Often, access to that data meant getting on a plane and flying to a different site,” he said. “The promise of Big Data is that you can now break down those siloes and democratize the access to data.”
It’s been a goal of IT to break down data siloes for ages, but then what? Just because you can see the data doesn’t mean you can do anything with it. But to Chaiken that doesn’t matter. The simple act of opening up data is huge, he believes.
Chaiken invokes Metcalfe’s Law to drive this point home. “Simply unlocking data creates a network effect for that data. You can now take some anonymous customer data, open it up, discover trends, and all sorts of networking effects start to happen across the organization.”
Chaiken gave the example of a water utility discovering a leak just because a usage spike showed up on some customer’s bill.
From the known unknowns to unknown unknowns
Examples such as the water utility one represent the low-hanging fruit of Big Data, and that’s probably what we’ll see for the next few years. In other words, water utilities have always wanted to spot leaks early, and someone in billing could have had an a-ha moment, realizing that if you compared billing stats with past service calls and found a high degree of correlation, you may be onto something predictive.
Before Big Data came along, though, that billing person could still have done the analysis, but it would have taken so long that most people would have given up before even trying.
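To make the billing-versus-service-calls comparison concrete, here is a minimal sketch in Python of the kind of check that billing person might run. Everything in it is an assumption for illustration: the numbers are invented, the variable names are hypothetical, and a plain Pearson correlation stands in for whatever analysis a real utility would actually use.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variances, left unnormalized because the factors cancel.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical data, invented for illustration: for each of eight accounts,
# the month-over-month jump in billed usage (percent) and whether a leak
# service call was later logged for that address (1) or not (0).
usage_jump_pct = [2, 5, 60, 3, 85, 4, 70, 1]
leak_call = [0, 0, 1, 0, 1, 0, 1, 0]

r = pearson(usage_jump_pct, leak_call)
print(f"correlation between usage jumps and leak calls: {r:.2f}")
```

A value near 1 would suggest that big jumps in a bill tend to go hand in hand with leak calls, which is the hint that a usage spike might be worth treating as an early warning. On eight made-up accounts this proves nothing, of course; the point is only that the calculation itself is simple once the billing and service data sit in the same place.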
There’s certainly value in Big Data insights that are of the obvious variety, but where Big Data gets interesting is when it surfaces strange ideas that would have occurred to exactly no one if data patterns didn’t point the way.
For instance, predictive analytics startup Kaggle found that if you’re going to buy a used car, you should get an orange one. Why? This is just Kaggle’s educated guess, but they believe that people buying orange cars are doing so as a means of self-expression, and, therefore, they tend to take better care of their cars. So I guess orange is now the opposite of a lemon.
And the “why” doesn’t matter. If the orange data holds up over time, you’ll want to buy orange used cars, whether or not the why is understood.