
7 Common Biases That Skew Big Data Results

Summary:  Flawed data analysis leads to faulty conclusions and bad business outcomes. Beware of these seven types of bias that commonly challenge organizations' ability to make smart decisions.

This is a great article by Lisa Morgan originally published on InformationWeek.com.  See the original article here.

Here’s a quick synopsis.

Data-driven decision-making is considered a smart move, but it can be costly or dangerous when something that appears to be true is not actually true. Even with the best of intentions, some of the world's most famous companies are challenged by skewed results because the data is biased, or the humans collecting and analyzing data are biased, or both.

 Here we present seven types of cognitive and data bias that commonly challenge organizations' decision-making. Once you've reviewed these, tell us in the comments section below whether you've experienced any in your organization, and how that worked out for you.

  1. Confirmation Bias
  2. Selection Bias
  3. Outliers
  4. Simpson’s Paradox
  5. Overfitting and Underfitting
  6. Confounding Variables
  7. Non-normality:  The Bell Does Not Toll
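Some of these biases are easier to grasp with a small numeric example. Below is a minimal Python sketch of Simpson's Paradox (item 4), using the classic kidney-stone treatment numbers often cited in the literature; the figures here are illustrative, not drawn from the article itself:

```python
# Simpson's Paradox: treatment A wins within each subgroup,
# yet loses when the subgroups are pooled.
# Illustrative numbers from the classic kidney-stone example:
# (successes, total) per treatment, split by stone size.
groups = {
    "small stones": {"A": (81, 87), "B": (234, 270)},
    "large stones": {"A": (192, 263), "B": (55, 80)},
}

def rate(successes, total):
    return successes / total

for name, g in groups.items():
    a, b = rate(*g["A"]), rate(*g["B"])
    print(f"{name}: A={a:.0%}  B={b:.0%}  -> A {'wins' if a > b else 'loses'}")

# Pool the subgroups and the ranking flips, because A was applied
# mostly to the harder (large-stone) cases -- a confounding variable.
total_a = tuple(map(sum, zip(*(g["A"] for g in groups.values()))))
total_b = tuple(map(sum, zip(*(g["B"] for g in groups.values()))))
print(f"overall: A={rate(*total_a):.0%}  B={rate(*total_b):.0%}  -> A loses")
```

The flip happens because the subgroup sizes are lopsided: treatment A was used mostly on the hard cases, dragging its pooled success rate down. This is also why item 6, confounding variables, appears on the same list.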

 

About the Author: Lisa Morgan is a freelance writer who covers big data and BI for InformationWeek. She has contributed articles, reports, and other types of content to various publications and sites ranging from SD Times to the Economist Intelligence Unit. Frequent areas of coverage include big data, mobility, enterprise software, the cloud, software development, and emerging cultural issues affecting the C-suite.

Comments

Comment by Amit Verma on July 18, 2015 at 4:21am

Good Article.

Comment by Nicolas Kiely on July 17, 2015 at 1:31pm

Good article. And it shows why effective communication is so critical to data science. Just an improper choice of wording for some otherwise good conclusions can imply not-so-good assumptions. Even for the wary analyst crossing every T and dotting every I, the wrong executive summary can lead decision makers to unsatisfactory conclusions. For instance, wording a conclusion as "Most of the time we found an increase in {stimulus p} was followed by a strong decrease in {signal q} within a 5-day window" may lead the unaware reader to assume p influences q, even though the analyst never meant to imply such a relationship. Discussions of leading indicators are not uncommon.

Weighing these considerations against the drive to keep such reports concise and digestible is not often an easy task. Consumers of analytical summaries are mostly concerned with the "action" part of actionable insight, and not so much with the drier details. But data science wouldn't be much fun if it were too easy.

© 2017 Data Science Central
