Big Data is a hot topic. It touches almost every aspect of the IT landscape. But it’s an umbrella term that means different things to different people. To get more concrete insights, I recently read the results of a focused research program involving over 400 corporate IT decision makers who handle initiatives related to Big Data. Here are some of the high points from what they found.
Many different stakeholders have pain and budget tied to Big Data.
Certainly it means lots of data…but more importantly it has implications for IT decision makers concerned with storage, security, network capacity, virtualization decisions, analytics to evaluate business performance, applications that leverage the data, and infrastructure like databases, programming tools, testing environments, and more.
The IT professionals surveyed were decision makers in the areas of database software, database hardware, data analytics, big data programming, data storage, and data visualization. Many also influenced decisions in areas where they were not the direct decision makers.
Data volumes are big and growing rapidly, even for mid-size companies.
Data volumes are massive and growing larger every year, putting strain on the infrastructure, processes, and people required to analyze the torrent of information.
About 22 percent of respondents said their company’s data volume was growing by between 20 and 50 percent a year; 28 percent said it was growing by 11 to 20 percent annually, and 31 percent reported annual growth of between 6 and 10 percent.
A fifth of respondents also reported their company’s data volume had already surpassed 5 petabytes, though the largest segment of respondents, 42 percent, reported volumes between 25 terabytes and 1 petabyte.
Interestingly, the trend doesn’t just apply to big companies, which we expected to have a lot of data. The fastest average data growth was among firms with 250M to 1B in revenue.
Investment is active because managers have pain around cost, security, bandwidth, & people.
64 percent of respondents reported their companies were either currently evaluating big data investments or evaluating for investment in the next 6-12 months. Another 11 percent reported their companies would like to invest in big data solutions but were not currently budgeted to do so.
Why now? They’re in pain!
On a scale of one to nine, many reported that the pain from infrastructure costs was at or approaching a nine, or “excruciatingly painful.”
The most prevalent areas of pain are around data complexity, data access and data infrastructure cost.
The pain is often driven by big data, but focused in areas like infrastructure and database systems that are not currently built to handle the need.
Forty-seven percent of respondents, for example, reported pain that was a seven or greater resulting from too many different sources of critical data, including 10 percent who rated it “excruciatingly painful.” Forty-six percent cited pain of seven or above because of “lack of a real-time view into critical data,” and 41 percent cited pain in “lack of integration of critical data.” Also telling is the broader pain indicated by more than 44 percent of respondents, who said there were significant problems with “increasing infrastructure” costs overall.
It’s an early, fragmented market for vendors.
While more than 50 percent of respondents expressed interest in or active plans to explore alternatives, 76 percent said their company was still using a relational database system for its big data needs, dwarfing the responses for Hadoop-based stores (13 percent) and NoSQL databases such as MongoDB (4 percent). 7 percent reported using a different solution or combination, such as Amazon, Netezza, Isilon, and unstructured parallel file systems.
A majority of respondents saw relational systems as high-cost relative to Hadoop or other big-data architectures, though both are understood as complex and requiring highly skilled personnel.
53 percent reported their big-data deployment was in-house, in a purpose-built appliance. 25 percent said it was an on-premises, roll-your-own appliance, and 18 percent reported their big data system was housed in a private cloud. Only 1 percent reported using a public cloud for big data.
As for where all this data is coming from, 89 percent said the source was enterprise applications. 40 percent indicated data was coming from other machine-generated sources like web monitors and sensor logs, and 25 percent said data came from human-generated text such as social media.
To analyze this data, traditional BI solutions are often being applied, such as those from Business Objects, Cognos, Oracle, and MicroStrategy, but increasingly new alternatives are being explored to address different aspects of the problem.