Smart Big Data: The All-Important 90/10 Rule

The sheer volumes involved with Big Data can sometimes be staggering. So if you want to get value from the time and money you put into a data analysis project, a structured and strategic approach is very important.

The phenomenon of Big Data is giving us ever-growing volume and variety of data we which we can now store and analyze. Any regular reader of my posts knows that I personally prefer to focus on Smart Data, rather than Big Data - because the term places too much importance on the size of the data. The real potential for revolutionary change comes from the ability to manipulate, analyze and interpret new data types in ever-more sophisticated ways.

Application of the Pareto distribution and 90/10 rule in a related context

The SMART Data Framework

I’ve written previously about my SMART Data framework which outlines a step-by-step approach to delivering data-driven insights and improved business performance.

  1. Start with strategy: Formulate a plan – based on the needs of your business
  2. Measure metrics and data: Collect and store the information you need
  3. Apply analytics: Interrogate the data for insights and build models to test theories
  4. Report results: Present the findings of your analysis in a way that the people who will put them into effect will understand
  5. Transform your business

Understand your customers better, optimize business processes, improve staff wellbeing or increase revenues and profits.

My work involves helping businesses use data to drive business value. Because of this I get to see a lot of half-finished data projects, mothballed when it was decided that external help was needed.

The biggest mistake by far is putting insufficient thought – or neglecting to put any thought – into a structured strategic approach to big data projects. Instead of starting with strategy, too many companies start with the data. They start frantically measuring and recording everything they can in the belief that big data is all about size. Then they get lost in the colossal mishmash of everything they’ve collected, with little idea of how to go about mining the all-important insights.

This is why I have come up with the 90/10 rule – When working with data, 90% of your time should be spent on a structured strategic approach, while 10% of your time should be spent “exploring” the data.

The 90/10 Rule

The 90% structured time should be used putting the steps outlined in the SMART Data framework into operation. Making a logical progression through an ordered set of steps with a defined beginning (a problem you need to solve), middle (a process) and an ending (answers or results).

This is after all why we call it Data Science. Business data projects are very much like scientific experiments, where we run simulations testing the validity of theories and hypothesis, to produce quantifiable results. 

The other 10% of your time can be spent freely playing with your data – mining for patterns and insights which, while they may be valuable in other ways, are not an integral part of your SMART Data strategy.

Yes, you can be really lucky and your data exploration can deliver valuable insights – and who knows what you might find, or what inspiration may come to you? But it should always play second-fiddle to following the structure of your data project in a methodical and comprehensive way.

Always start with strategy

I think this is a very important point to make, because it’s something I often see companies get the wrong way round. Too often, the data is taken as the starting point, rather than the strategy.

Businesses that do this run the very real risk of becoming “data rich and insight poor”. They are in danger of missing out on the hugely exciting benefits that a properly implemented and structured data-driven initiative can bring.

Working in a structured way means “Starting with strategy”, which means identifying a clear business need and what data you will need to solve it. Businesses that do this, and follow it through in a methodical way will win the race to unearth the most valuable and game-changing insights.

I hope you found this post interesting. I am always keen to hear your views on the topic and invite you to comment with any thoughts you might have.

About : Bernard Marr is a globally recognized expert in analytics and big data. He helps companies manage, measure, analyze and improve performance using data.

His new book is: Big Data: Using Smart Big Data, Analytics and Metrics To Make Bette... You can read a free sample chapter here

DSC Resources

Additional Reading

Follow us on Twitter: @DataScienceCtrl | @AnalyticBridge

Views: 9807


You need to be a member of Data Science Central to add comments!

Join Data Science Central

Comment by Steven Rosen on April 8, 2015 at 8:26am

I cannot agree with you more. It would seem obvious to "start with strategy", but this concept is lost upon many data analysts. Furthermore, it is also the reason why executive stakeholders wonder, after making significant investments in data collection and storage, why their organization has presented them with so few useful insights. 

Comment by Randall Shane on March 20, 2015 at 12:06pm

I particularly like the phrase Smart Data, over Big Data.  If we are truly talking about the three V's, then Big Data is fine I guess, but it appears that most people confuse Big Data with analytics, predictive analysis, or any other type of data related subject. Yes, you can do these things with large volumes of data, but you can also do those things with smaller data sets. I guess my point is that if one phrase was going to be used to encompass all things data, Smart Data is a better terminology. 

Comment by Nagaraj Kulkarni on March 19, 2015 at 6:57pm

“Starting with strategy”- I fully endorse the smart move.

Comment by Oojwal Manglik on March 18, 2015 at 9:57pm

I totally agree with your observation that majority of data projects end up being exploratory in nature rather than adding tangible value to business. The key issue is identifying the right problems to solve. Everything else from what kind of data to analyse to what kind of tools to deploy should follow. Unfortunately in many cases the process starts with data not with the problem.

© 2021   TechTarget, Inc.   Powered by

Badges  |  Report an Issue  |  Privacy Policy  |  Terms of Service