Big Data, housed in new and disruptive technologies, is expected to account for more than 50 percent of the world’s data in the next five years, according to a new study. While it offers huge and untapped value, the inevitable result is stress and strain on the world’s Internet infrastructure as companies seek to manage this explosion of information.
The new study, released jointly by Internet Research Group and Infineta Systems, a provider of WAN optimization systems, examines how Big Data is affecting the enterprise WAN (wide area network) throughout the country.
Big Data, defined as datasets whose size is beyond the ability of typical database software tools to capture, store, manage and analyze, is most often found in petabyte to exabyte sizes, and is unstructured, distributed and stored in flat schemas. As Big Data continues to grow, the industry anticipates both enormous change and untapped value for enterprises. According to Infineta’s report, most companies will adopt key Big Data technologies in the next 12 to 18 months.
All this data in need of capture, storage, processing and distribution has the potential to clog networks. About 0.5 Gbps of bandwidth is needed per petabyte of Big Data under management by Hadoop, an open source platform for large-scale computing. That bandwidth demand can compromise the latency, speed and reliability of the enterprise WAN.
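To make that rule of thumb concrete, here is a minimal sketch that scales the report's figure of roughly 0.5 Gbps per petabyte to a few cluster sizes. The function name and the sample sizes are illustrative assumptions, not from the study, and the linear scaling is a simplification of whatever model the report actually uses.

```python
# Rule of thumb quoted in the report: ~0.5 Gbps per petabyte under Hadoop management.
GBPS_PER_PETABYTE = 0.5


def estimated_bandwidth_gbps(petabytes: float) -> float:
    """Estimate bandwidth demand in Gbps, assuming it scales linearly with data size.

    The function name and the linear model are assumptions for illustration.
    """
    if petabytes < 0:
        raise ValueError("petabytes must be non-negative")
    return GBPS_PER_PETABYTE * petabytes


if __name__ == "__main__":
    # Hypothetical cluster sizes, chosen only for illustration.
    for pb in (1, 4, 10):
        print(f"{pb} PB -> {estimated_bandwidth_gbps(pb):.1f} Gbps")
```

Under these assumptions, a 10-petabyte Hadoop deployment would call for about 5 Gbps of bandwidth.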
Infineta is interested in this topic, as the privately held company, based in San Jose, California, supplies products that support critical machine-scale workflows across the data center interconnect. However, the study's findings highlight developing trends that are affecting the entire data center industry.
Key trends identified by Infineta include:
The report finds that organizations are deploying Hadoop clusters as a centralized service offering so that individual divisions don’t have to build and run their own, and that “bigger is better” when it comes to processing batch workloads.
This setup leads to Big Traffic: data movement between clusters, within a data center and between data centers. Data movement includes but is not limited to replication and synchronization, which will become especially important as Hadoop becomes a significant factor in enterprise storage. Big Traffic data movement services support Big Data analytics, regulatory compliance requirements, high availability services and security services.