National Institute of Standards and Technology Takes on Big Data

Summary:  NIST weighs in on Big Data technology, standards, use cases, and a surprising variety of valuable documentation.

You can bet that the folks at DARPA and our other Federal forward thinkers had their eye on Big Data pretty much from its inception in about 2007.  Say what you will about the Fed, but those research dollars gave us the Internet, supercomputing, and a whole lot more.

The Big-Web-User community (Google, Facebook, Yahoo, Amazon, and their friends and competitors) is pumping big private dollars into the commercial applications that have derived from Big Data and NoSQL over the last six or seven years, but the limitation of private money is that the payoff (monetizing the IP) has to be in sight.  For the most part this means that private investment is going into mainstream ideas, implementations, and infrastructure.

So when the Fed says it wants to focus on Big Data, what should we expect for our tax dollars?  A couple of things.  First, we actually rely on government to foster those areas of technology that are perhaps not so immediate and where profit is a little too far out for VCs.  This early stage R&D funding is what has kept the US out in front on a variety of tech innovations for almost 60 years.  Second, the Fed can be a force in defining the direction of a technology, not by regulation but by the allocation of resources and by directing the creation of programs at different agencies that interact with industry, academia, and our overseas friends and foes.

In early 2012 the Obama administration released its first guiding document and announcement of financial support for Big Data, the “Big Data Research and Development Initiative” (see it here) covering this first wave of commitments.

  • National Science Foundation and the National Institutes of Health – Core Techniques and Technologies for Advancing Big Data Science & Engineering
  • Department of Defense – Data to Decisions
  • Defense Advanced Research Projects Agency (DARPA) XDATA program
  • National Institutes of Health – 1,000 Genomes Project Data Available on Cloud
  • Department of Energy – Scientific Discovery Through Advanced Computing
  • US Geological Survey – Big Data for Earth System Science

Then in early 2013 the National Institute of Standards and Technology (NIST), which is part of the Department of Commerce, was given the task of creating a Big Data Technology Roadmap.

This roadmap will define and prioritize requirements for interoperability, portability, reusability, and extensibility for big data analytic techniques and technology infrastructure in order to support secure and effective adoption of Big Data.

To accomplish this, NIST created the Public Working Group for Big Data under the leadership of Mr. Wo Chang, an experienced digital data advisor within NIST.  Wo’s task was to create a broad-based working group from industry, government, and academia:

With the goal of developing consensus definitions, taxonomies, secure reference architectures, and a technology roadmap. The aim is to create vendor-neutral, technology- and infrastructure-agnostic deliverables to enable Big Data stakeholders to pick and choose the best analytics tools for their processing and visualization requirements on the most suitable computing platforms and clusters, while allowing value-added from Big Data service providers and flow of data between the stakeholders in a cohesive and secure manner.

So here’s where this gets personal.  Seeing that the Working Group was accepting all comers, I volunteered as a contributor on the Definitions and Taxonomies committees, which are two of the seven areas into which the task was divided:

  1. Big Data Definitions
  2. Big Data Taxonomies
  3. Big Data Use Cases and General Requirements
  4. Big Data Security and Privacy Requirements
  5. Big Data Security and Privacy Reference Architectures
  6. Big Data Reference Architectures
  7. Big Data Standards Roadmap

Several hundred participants volunteered, most from industry, and we met once a week via video teleconference to develop this content, which currently totals over 500 pages.

My hat’s off to Wo and the seven volunteer committee chairs who brought all these thoughts together in coherent form and for their masterful skills in herding pussycats.  Let me just say that there were many strong disagreements but a minimum of flame wars.

The final series of seven volumes is due to be published at the end of summer, just a few weeks away, but if you readers want to get a look before anyone else, the final drafts are available for download here:

M0392 | PDF | Draft SP 1500-1 -- Volume 1: Definitions

M0393 | PDF | Draft SP 1500-2 -- Volume 2: Taxonomies

M0394 | PDF | Draft SP 1500-3 -- Volume 3: Use Case & Requirements

M0395 | PDF | Draft SP 1500-4 -- Volume 4: Security and Privacy

M0396 | PDF | Draft SP 1500-5 -- Volume 5: Architectures White Paper Survey

M0397 | PDF | Draft SP 1500-6 -- Volume 6: Reference Architecture

M0398 | PDF | Draft SP 1500-7 -- Volume 7: Standards Roadmap

These volumes bring together points of view about Big Data that you won’t see anywhere else, thanks to the extreme diversity of industry and academic contributors.  Volume 3, Use Case & Requirements, contains 51 diverse implementation summaries:

  • Government Operations (4)
  • Commercial (8)
  • Defense (3)
  • Healthcare and Life Sciences (10)
  • Deep Learning and Social Media (6)
  • The Ecosystem for Research (4)
  • Astronomy and Physics (5)
  • Earth, Environmental and Polar Science (10)
  • Energy (1)

These are actually pretty detailed, but if you’d like to drill down into the really extensive documentation of the use cases, a separate document with that detail can be found here (use case detail).

If you want a single reference for the scope of Big Data as it exists today, this set is tough to beat.  Two additional updates are planned over the coming year or two.  Beat the crowd and enjoy a sneak peek at these final drafts.

 

July 28, 2015

Bill Vorhies, President & Chief Data Scientist – Data-Magnum - © 2015, all rights reserved.

 

About the author:  Bill Vorhies is President & Chief Data Scientist at Data-Magnum and has practiced as a data scientist and commercial predictive modeler since 2001.  Bill is also Editorial Director for Data Science Central.  He can be reached at:

[email protected]

The original article can be seen at:  http://data-magnum.com/national-institute-of-standards-and-technology-takes-on-big-data/

 

Comment by Hieu Tran on December 24, 2015 at 7:51am

Brilliant article. Thank you for sharing your insight. Looking forward to further updates on this effort.

Comment by William Vorhies on August 5, 2015 at 11:56am

Oleg:

I'm not aware of any international dimension since this was a US Commerce Department activity.  I imagine you could use this material as a starting point for further development in any country.

Comment by Oleg Okun on August 1, 2015 at 10:57am

Very much needed work! Thanks a lot, Bill! Correct me if I am wrong, but drafts were composed by US contributors, right? I am interested to know who is a contact person for Germany.
