
The ‘bell curve’, or ‘Gaussian bell curve’, is one of the fundamental concepts on which much of statistical analysis is based. From social sciences to astronomy to financial services, many real-world applications of statistics rely on the assumption that the data being analysed is distributed in the shape of the bell curve.

**What does the bell curve mean?**

The bell curve, named after Herr Professor Doktor Gauss, is a beautiful visual that depicts how data from any normal distribution would behave.

In simplest terms, the Gaussian bell curve reveals that in a normal distribution most observations hover around the middle or the mediocre, i.e. the average. The odds of deviating from this average (the chances of a value lying far from the middle) decline at an ever-faster rate, faster even than exponentially, as we move away from the average.

Let us take a simple example* to understand this feature of a normal distribution.

(* - This example has been taken from the book ‘The Black Swan’ by Nassim Nicholas Taleb. As a matter of fact, this whole article is inspired by Dr. Taleb’s brilliant writings on this subject.)

Assume that the average height of humans is 167 cm, or about 5 feet, 6 inches. Also assume that the unit of deviation (generally taken as the standard deviation of the population) is 10 cm.

Now, by the rules of the bell curve, if one were to look at a large enough, randomly chosen sub-population of humans, one would find most people to be of a height close to the average, i.e. 167 cm.

Put another way, more people are likely to be of height 168 cm (1 cm away from the average) than say 178 cm (11 cm away from the average).
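To put a number on this comparison, here is a minimal Python sketch. It assumes heights follow a normal distribution with the mean of 167 cm and standard deviation of 10 cm used in this example, and it relies on `statistics.NormalDist` from the standard library (Python 3.8 or later). The variable names are illustrative only.

```python
from statistics import NormalDist

# Assumed model from the example: mean 167 cm, standard deviation 10 cm.
heights = NormalDist(mu=167, sigma=10)

# Height of the bell curve (probability density) at the two heights.
density_168 = heights.pdf(168)
density_178 = heights.pdf(178)

# How many times more common is a height near 168 cm than one near 178 cm?
ratio = density_168 / density_178
print(f"Density at 168 cm: {density_168:.4f}")
print(f"Density at 178 cm: {density_178:.4f}")
print(f"Ratio: {ratio:.2f}")
```

Under these assumptions, a height near 168 cm comes out roughly 1.8 times as common as a height near 178 cm.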

And the odds of finding someone much taller (or shorter) than the average decrease at a faster and faster rate.

The odds of finding someone more than 10 cm taller than the average, i.e. taller than 177 cm, are **1 in 6.3.**

The odds of finding someone more than 20 cm taller than the average, i.e. taller than 187 cm or about 6 ft 2 inches, are **1 in 44.**

The odds of finding someone more than 60 cm taller than the average, i.e. taller than 227 cm or about 7 ft 5 inches, are **1 in a billion.**

The odds of finding someone more than 70 cm taller than the average, i.e. taller than 237 cm, are **1 in 780 billion.**

The main point to understand here is the pace at which the odds decline as we look for more and more unusual observations. For the 10 cm increase in height from 177 cm to 187 cm, the odds change from 1 in 6.3 to 1 in 44, a factor of about 7. But for the same 10 cm increase from 227 cm to 237 cm, the odds change from 1 in a billion to 1 in 780 billion, a factor of about 780.
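The odds quoted above can be checked with a short Python sketch. It assumes the same normal model (mean 167 cm, standard deviation 10 cm) and computes the upper-tail probability from the complementary error function `math.erfc` in the standard library. The helper `upper_tail` and the other names are hypothetical, introduced only for this illustration.

```python
import math

def upper_tail(z):
    """Probability that a standard normal value exceeds z standard deviations."""
    # erfc keeps precision far out in the tail, where 1 - cdf would lose digits.
    return 0.5 * math.erfc(z / math.sqrt(2))

MEAN_CM = 167   # assumed average height from the example
SD_CM = 10      # assumed standard deviation from the example

odds = {}
for excess_cm in (10, 20, 60, 70):
    z = excess_cm / SD_CM
    odds[excess_cm] = 1 / upper_tail(z)
    print(f"Taller than {MEAN_CM + excess_cm} cm: 1 in {odds[excess_cm]:,.1f}")

# Factor by which the odds lengthen for the same 10 cm step.
near_factor = odds[20] / odds[10]   # 177 cm -> 187 cm
far_factor = odds[70] / odds[60]    # 227 cm -> 237 cm
print(f"177 -> 187 cm: odds lengthen about {near_factor:.0f}x")
print(f"227 -> 237 cm: odds lengthen about {far_factor:.0f}x")
```

The same 10 cm step lengthens the odds by a factor of about 7 near the average and by a factor of several hundred far out in the tail, in line with the rounded figures quoted in the text.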

This is the essential property of a bell curve. The odds of finding larger and larger observations become so small that the outliers or unusual occurrences become a very, very remote possibility and hence can be ignored for all practical purposes.

This is the boon of the bell curve. It allows us to focus on the mediocre or the ordinary, and ignore the rare or the barely possible.

This is why statisticians, academicians, analysts and all sorts of people love the bell curve. It allows them to focus on the usual, the frequent or the norm.

Statistical models, from simple regression models to complex ones like the Black-Scholes model in finance, are based on this property of the bell curve.

It is this property that allows us to say that it is highly improbable to see someone who is over 8 feet tall, or to make even more precise predictions, such as: 68% of a large, randomly selected population is going to be between 157 and 177 cm in height. Many of the declarations that you regularly encounter in daily life, from medical test results to exit polls, rest on the same property.
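Both statements can be reproduced with a few lines of Python. This is a sketch under the article's assumptions (normal heights, mean 167 cm, standard deviation 10 cm), not a claim about real height data. It uses `statistics.NormalDist` and `math.erfc` from the standard library; the variable names are illustrative.

```python
import math
from statistics import NormalDist

heights = NormalDist(mu=167, sigma=10)  # assumed model from the example

# Share of the population between 157 cm and 177 cm (within one standard deviation).
share_within_one_sd = heights.cdf(177) - heights.cdf(157)
print(f"Between 157 and 177 cm: {share_within_one_sd:.1%}")

# Probability of exceeding 8 feet, converted to centimetres (1 foot = 30.48 cm).
eight_feet_cm = 8 * 30.48
z = (eight_feet_cm - heights.mean) / heights.stdev
prob_over_eight_feet = 0.5 * math.erfc(z / math.sqrt(2))
print(f"Taller than 8 feet: about 1 in {1 / prob_over_eight_feet:,.0f}")
```

The first figure comes out at roughly 68%. The second is so small that, under this model, an 8-foot person is effectively never expected in any realistic sample.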

So far we have talked about how the bell curve is a boon for statistical analysis – it helps us simplify things and use rules to understand distributions. The curve’s symmetry and consistency make it ideal for making predictions. This is why it is such an important concept in business analytics.

In the next article we will turn to the main topic, i.e. how the same qualities that make the bell curve so tempting and useful are also a curse. We will see how uninformed application of the bell curve can lead to serious errors and cause more harm than good. We will also see how misuse of the bell curve is a lot more rampant than we think.

**Author bio**

Gaurav Vohra is an alumnus of IIM Bangalore with over 10 years of experience in the field of analytics. Gaurav has been in the analytics industry since its early days, and his career has spanned companies like Capital One and Information Resources Inc., recognized as thought leaders in the analytics space.

Gaurav is now the co-founder of **Jigsaw Academy (www.jigsawacademy.com)**, a training institute that aims to meet the growing demand for talent in the field of analytics by providing industry-relevant training to develop business-ready professionals. You can visit Gaurav’s blog at **www.analyticstraining.com**


Posted 10 May 2021
