
There has been enough buzz about Big Data over the past year or two to create a very big data file. It seems that if you do not have a big data strategy (mining e-commerce and social media sites for new insights into your customers, using new data management tools like Hadoop), you are at a serious disadvantage to competitors who are all jumping on the big data bandwagon.

 

And - there is merit to this line of thought for some companies. The volume, variety and velocity of data flows have grown sharply with the growth in online activity. More types of data are being generated at an increasing pace, and companies need to understand how best to manage and leverage that information to the benefit of their customers, their prospects and their own bottom line. IBM adds veracity as a fourth V of Big Data, reflecting the lack of trust management has in much of its data.

 

 

 

All that said - let’s not forget about small data.  Former McKinsey consultant Allen Bonde has noted that big data is about machines and small data is about people.

For many companies, email marketing reports, Google Analytics and other website analytics can be readily used alongside internally generated transactional reports. This data tends to be activity-oriented, locally sourced and easily accessible, and it can be used to deliver immediate results.

 

Companies can mine small data far more easily to generate insights that drive marketing initiatives, including cross-sell and upsell opportunities, marketing spend analysis and the like. However, as with any data analysis project, it is important to:

  • define the business problem being addressed
  • identify the data sources to be utilized
  • ensure the integrity of the data 
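
To make the three steps above concrete, here is a minimal Python sketch of a marketing spend analysis on small data. Everything in it is an assumption made up for illustration: the business question, the channel names, the figures and the `cost_per_sale` helper. It is one possible way to walk through the steps, not a prescribed method.

```python
# Step 1: the business problem (an assumed example):
# "Which marketing channel delivers sales at the lowest cost?"

# Step 2: the data source. A hypothetical monthly campaign report; the
# channel names and figures are invented for illustration.
report = [
    {"channel": "email", "spend": 500.0, "sales": 25},
    {"channel": "search ads", "spend": 2000.0, "sales": 40},
    {"channel": "social", "spend": 1200.0, "sales": 0},
]

# Step 3: a basic integrity check before any analysis.
for row in report:
    if row["spend"] < 0 or row["sales"] < 0:
        raise ValueError(f"implausible figures for {row['channel']}")


def cost_per_sale(row):
    """Return spend divided by sales, or None when there were no sales."""
    if row["sales"] == 0:
        return None
    return row["spend"] / row["sales"]


results = {row["channel"]: cost_per_sale(row) for row in report}
for channel, cost in results.items():
    label = "no sales" if cost is None else f"{cost:.2f} per sale"
    print(f"{channel}: {label}")
```

A report this small fits in a spreadsheet export, which is the point: the question and the integrity check matter more than the tooling.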

 

Given the current hype around analytics, there is a risk that business users may jump right in, expecting their software tools to automatically identify the right questions and the answers. To be fair, some cloud-based services, such as IBM’s Watson Analytics and SAS, do a great job at this, as do more visualization-oriented tools such as Tableau. However, companies will achieve the greatest ROI by taking a rigorous approach to defining what they want to achieve from their data analysis project.

 

Once this is done, the business will be in a better position to identify the best data sources to achieve its objective. Companies may find that internally generated spreadsheets and transactional reports from CRM systems such as Salesforce, or from their ERP and accounting systems, are useful, and that undertaking customer surveys or aggregating other third-party data can quickly and cheaply improve the quality of the data being analyzed.

 

The heavy lifting of data cleansing then needs to be completed to ensure that missing, incomplete, inaccurate or duplicate records are identified and treated appropriately.
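
As a rough illustration of that cleansing step, the following Python sketch flags missing, incomplete, inaccurate and duplicate records in a small customer list. The field names, the sample rows and the rules in the hypothetical `clean` function (for example, treating a negative order total as inaccurate) are all assumptions for the sake of the example; real rules depend on the business problem.

```python
# Hypothetical customer records, e.g. exported from a spreadsheet.
# Field names and values are invented for illustration.
records = [
    {"email": "ann@example.com", "name": "Ann", "order_total": "120.50"},
    {"email": "ANN@example.com ", "name": "Ann", "order_total": "120.50"},  # duplicate
    {"email": "", "name": "Bob", "order_total": "75.00"},                   # missing email
    {"email": "cat@example.com", "name": "Cat", "order_total": "-40"},      # inaccurate
    {"email": "dan@example.com", "name": "Dan", "order_total": "n/a"},      # incomplete
    {"email": "eve@example.com", "name": "Eve", "order_total": "310"},
]


def clean(rows):
    """Split rows into clean records and rejected (row, reason) pairs."""
    seen = set()
    kept, rejected = [], []
    for row in rows:
        # Normalize the email so trivially different spellings match.
        email = row["email"].strip().lower()
        if not email:
            rejected.append((row, "missing email"))
            continue
        try:
            total = float(row["order_total"])
        except ValueError:
            rejected.append((row, "order total is not a number"))
            continue
        if total < 0:
            rejected.append((row, "negative order total"))
            continue
        if email in seen:
            rejected.append((row, "duplicate email"))
            continue
        seen.add(email)
        kept.append({"email": email, "name": row["name"], "order_total": total})
    return kept, rejected


kept, rejected = clean(records)
print(f"kept {len(kept)} of {len(records)} records")
for row, reason in rejected:
    print(f"rejected {row['name']}: {reason}")
```

Keeping the rejected rows together with a reason, rather than silently dropping them, lets someone decide how each case should be treated.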

 

Once done, the company will be in a position to start generating insights specific to the business problem being addressed and quickly implement processes to drive an improved ROI.  And, who knows, maybe then they will have the experience, confidence and budget to move on to Big Data!

 

 

 

 


Comment by John Miraglia on October 7, 2015 at 3:08am

Well stated, Greg. I've run "small" data projects for candidate assessment and found actionable results. If I'd waited for the green light and budget for a big data investment, I'd have gotten nowhere. Small data projects are the bread and butter of running a skunk works.

Comment by Sione Palu on October 1, 2015 at 12:20pm

Data is just data, whether small, big, dark, white or what have you. The algorithm is ignorant of where the data comes from, whether it is a user-by-movie matrix of Netflix ratings or a word-by-document matrix from a corpus of legal documents. SVD, NMF, LDA and the like still see a matrix of numbers, irrespective of where that matrix originated, be it big, small, dark or white data.

Comment by Ahmed Shibrain on September 29, 2015 at 1:16pm

Yes, small data matters, and even tiny data.

I think insight is not about how big our data is, but rather about how valuable!


© 2017 Data Science Central
