Summary: NIST weighs in on Big Data technology, standards, use cases, and a surprising variety of valuable documentation.
You can bet that the folks at DARPA and our other Federal forward thinkers have had their eye on Big Data pretty much since its inception around 2007. Say what you will about the Fed, but those research dollars gave us the Internet, supercomputing, and a whole lot more.
The Big-Web-User community (Google, Facebook, Yahoo, Amazon, and their friends and competitors) has been pumping big private dollars into the commercial applications derived from Big Data and NoSQL over the last six or seven years, but the limitation of private money is that the payoff (monetizing the IP) has to be in sight. For the most part this means that private investment goes into mainstream ideas, implementations, and infrastructure.
So when the Fed says it wants to focus on Big Data, what should we expect for our tax dollars? A couple of things. First, we actually rely on government to foster those areas of technology that are perhaps not so immediate, where profit is a little too far out for VCs. This early-stage R&D funding is what has kept the US out in front on a variety of tech innovations for almost 60 years. Second, the Fed can be a force in defining the direction of a technology, not by regulation but by allocating resources and by directing the creation of programs at different agencies that interact with industry, academia, and our overseas friends and foes.
In early 2012 the Obama administration released its first guiding document and announcement of financial support for Big Data, the “Big Data Research and Development Initiative” (see it here), covering this first wave of commitments.
Then in early 2013 the National Institute of Standards and Technology (NIST), part of the Department of Commerce, was given the task of creating a Big Data Technology Roadmap.
This roadmap will define and prioritize requirements for interoperability, portability, reusability, and extensibility for big data analytic techniques and technology infrastructure in order to support secure and effective adoption of Big Data.
To accomplish this, NIST created the Public Working Group for Big Data under the leadership of Mr. Wo Chang, an experienced digital data advisor within NIST. Wo’s task was to create a broad-based working group from industry, government, and academia with the goal of developing consensus definitions, taxonomies, secure reference architectures, and a technology roadmap. The aim is to create vendor-neutral, technology- and infrastructure-agnostic deliverables that enable Big Data stakeholders to pick and choose the best analytics tools for their processing and visualization requirements on the most suitable computing platforms and clusters, while allowing value-add from Big Data service providers and the flow of data between stakeholders in a cohesive and secure manner.
So here’s where this gets personal. Seeing that the Working Group was accepting all comers, I volunteered as a contributor on the Definitions and Taxonomies committees, two of the seven areas into which the task was divided.
Several hundred participants volunteered, most from industry, and we met once a week via video teleconference to develop this content, which currently totals over 500 pages.
My hat’s off to Wo and the seven volunteer committee chairs who brought all these thoughts together in coherent form and for their masterful skills in herding pussycats. Let me just say that there were many strong disagreements but a minimum of flame wars.
The final series of seven volumes is due to be published at the end of summer, just a few weeks away, but if you readers want a look before anyone else, the final drafts are available for download here:
These volumes bring together points of view about Big Data that you won’t see anywhere else, thanks to the extreme diversity of industry and academic contributors. Volume 3, Use Cases, alone contains 51 diverse implementation summaries.
These are actually pretty detailed, but if you’d like to drill down into the really extensive documentation of the use cases, a separate document with that detail can be found here (use case detail).
If you want a single reference for the scope of Big Data as it exists today, this set is tough to beat. Two additional updates are planned over the coming year or two. Beat the crowd and enjoy a sneak peek at these final drafts.
July 28, 2015
Bill Vorhies, President & Chief Data Scientist – Data-Magnum - © 2015, all rights reserved.
About the author: Bill Vorhies is President & Chief Data Scientist at Data-Magnum and has practiced as a data scientist and commercial predictive modeler since 2001. Bill is also Editorial Director for Data Science Central. He can be reached at:
The original article can be seen at: http://data-magnum.com/national-institute-of-standards-and-technology-takes-on-big-data/