There's no doubt about it, probability and statistics is an *enormous* field, encompassing topics from the familiar (like the average) to the complex (…

Added by Stephanie Glen on December 9, 2019 at 7:48am — No Comments

At the time of writing, I'm a 52-year-old working in the fields of mathematics and data science. In mathematics, that makes me well-seasoned (and probably well-tenured, if I had chosen to continue in academia). In data science, some would consider me a dinosaur. In fact, many older people considering a career in data science might be put off by the thought that data science is tough to break into at a later age. But is that statement true? Should the over-50 crowd put down their textbooks…

Added by Stephanie Glen on November 30, 2019 at 8:30am — 6 Comments

The standard error is really just a type of standard deviation. For this simple example, I've used three samples as an illustration of how the standard deviation and standard error differ as they relate to…
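As a rough illustration of the distinction, here is a minimal Python sketch. It assumes the usual definition of the standard error of the mean (the sample standard deviation divided by the square root of the sample size). The data values and the `standard_error` helper are made up for this example and are not the three samples used in the original post.

```python
import math
import statistics


def standard_error(sample):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))


# Hypothetical measurements, invented purely for illustration.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

sd = statistics.stdev(sample)  # spread of the individual observations
se = standard_error(sample)    # estimated spread of the sample mean

print(f"SD = {sd:.3f}, SE = {se:.3f}")
```

The standard deviation describes how much individual values vary, while the standard error describes how much the sample mean would be expected to vary from sample to sample, so it shrinks as the sample grows.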

Added by Stephanie Glen on November 25, 2019 at 1:21pm — No Comments

This picture compares two different types of bootstrap (parametric and non-parametric) with their traditional counterpart.

Which Bootstrap When?…


Added by Stephanie Glen on November 14, 2019 at 6:56pm — No Comments

Resampling is a way to reuse data to generate new, hypothetical samples (called *resamples*) that are representative of an underlying population. It's used when:

- You don't know the underlying distribution for the population,
- Traditional formulas are difficult or impossible to apply,
- As a substitute for traditional…
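To make the idea of a resample concrete, here is a small sketch of one common resampling method, the non-parametric bootstrap, applied to the mean. The data, the `bootstrap_means` helper, the number of resamples, and the seed are all assumptions chosen for illustration; the original post may use different examples.

```python
import random
import statistics


def bootstrap_means(data, n_resamples=1000, seed=0):
    """Draw resamples with replacement and return the mean of each one."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(data)
    return [
        statistics.fmean(rng.choices(data, k=n))  # one resample, same size as data
        for _ in range(n_resamples)
    ]


# Hypothetical observations, invented for illustration.
data = [12.1, 9.8, 11.4, 10.7, 13.2, 9.5, 10.9, 12.6]

means = bootstrap_means(data)
boot_se = statistics.stdev(means)  # bootstrap estimate of the standard error

print(f"Bootstrap SE of the mean: {boot_se:.3f}")
```

Each resample is the same size as the original data and is drawn with replacement, so no assumption about the underlying distribution is needed.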

Added by Stephanie Glen on November 6, 2019 at 9:17am — No Comments

Unsupervised learning algorithms are "unsupervised" because you let them run without direct supervision. You feed the data into the algorithm, and the algorithm figures out the patterns. The following picture shows the differences between three of the most popular unsupervised learning algorithms: Principal Component Analysis,…

Added by Stephanie Glen on October 31, 2019 at 7:43am — No Comments

Which statistical method you use to compare data sets depends on two main factors: your overall goal and the type of data you have. Parametric data means that you know the underlying distribution (for example, your data might be normally distributed).…

Added by Stephanie Glen on October 26, 2019 at 8:45am — No Comments

P-values ("Probability values") are one way to test if the result from an experiment is statistically significant. This picture is a visual aid to p-values, using a theoretical experiment for a pizza business.…

Added by Stephanie Glen on October 18, 2019 at 8:47am — 6 Comments

Correlation and regression analysis both deal with relationships between variables. There are many different types of correlation and regression; this image focuses on the differences between the two most common ones: Pearson correlation and…
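For readers who want to see the correlation side in code, here is a minimal sketch of the Pearson correlation coefficient computed from its textbook definition. The `pearson_r` function and the sample values are hypothetical and written only for this illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired observations, invented for illustration.
hours = [1, 2, 3, 4, 5]
scores = [52, 58, 61, 70, 74]

r = pearson_r(hours, scores)
print(f"r = {r:.3f}")
```

The result always falls between -1 and 1, and it measures only the strength and direction of a linear relationship.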

Added by Stephanie Glen on October 10, 2019 at 12:01pm — No Comments

You may have figured out already that statistics isn't exactly a science. Lots of terms are open to interpretation, and sometimes there are many words that mean the same thing (like "mean" and "average") or *sound* like they should mean the same thing, like *significance level* and *confidence level*.

Although they sound very similar, significance level and confidence level are in fact two completely different concepts. Confidence levels and *confidence…*

Added by Stephanie Glen on September 30, 2019 at 12:00pm — No Comments

Correlation coefficients enable you to find relationships between a wide variety of data. However, the sheer number of options can be overwhelming. This picture sums up the differences between five of the most popular correlation coefficients.…

Added by Stephanie Glen on September 22, 2019 at 6:03am — No Comments

The main **difference between stratified sampling and cluster sampling** is that with cluster sampling, you have natural groups separating your population. For example, you might be able to divide your data into natural groupings like city blocks, voting districts…

Added by Stephanie Glen on September 14, 2019 at 5:28am — No Comments

Statistics, Statistical Learning, and Machine Learning are three different areas with a large amount of overlap. Despite that overlap, they are distinct fields in their own right. The following picture illustrates the difference between the three fields.

Added by Stephanie Glen on September 6, 2019 at 8:47am — 1 Comment

ANOVA is a test to see if there are differences between groups. Put simply, "one-way" or "two-way" refers to the number of independent variables (IVs) in your test. However, there are other subtle differences between the tests and the more general factorial ANOVA. This picture sums up the differences.…

Added by Stephanie Glen on August 27, 2019 at 9:21am — No Comments

This simple picture shows the differences between descriptive statistics and inferential statistics.

Added by Stephanie Glen on August 24, 2019 at 7:32am — No Comments

The following picture shows the differences between the Z Test and T Test. Not sure which one to use? Find out more here:…

Added by Stephanie Glen on August 13, 2019 at 10:30am — No Comments

Like many emergency rooms in the United Kingdom, the A&E department at Salford Royal NHS Foundation Trust, Greater Manchester, faces high congestion. This results in treatment delays and access issues. The Data Science team at the Northern Care Alliance (NCA) National Health Service (NHS) Group of hospitals is implementing support mechanisms to **ease wait times**, using machine learning and regression to…

Added by Stephanie Glen on August 5, 2019 at 5:29am — 1 Comment

The basic idea behind regression analysis is to take a set of data and use that data to make predictions. A useful first step is to make a scatter plot to see the rough shape of your data.…
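As a hedged sketch of that basic idea, the snippet below fits a straight line by ordinary least squares and then uses it to make a prediction. It assumes the scatter plot suggests a roughly linear shape. The `fit_line` and `predict` helpers and the data are hypothetical, not taken from the original post.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Predicted y for a new x under the fitted line."""
    return intercept + slope * x


# Hypothetical data, invented for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(xs, ys)
print(f"y = {intercept:.2f} + {slope:.2f} * x")
print(f"prediction at x = 6: {predict(slope, intercept, 6):.2f}")
```

If the scatter plot shows a curve rather than a line, a straight-line fit like this one would be a poor choice and another type of regression would be worth considering.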

Added by Stephanie Glen on July 31, 2019 at 4:00am — No Comments

Decision Trees, Random Forests and Boosting are among the…

Added by Stephanie Glen on July 28, 2019 at 7:30am — 1 Comment

In my previous posts, I compared model evaluation techniques using Statistical Tools & Tests and commonly used Classification and Clustering evaluation techniques.

In this post, I'll take a look at how you can compare regression models. Comparing…

Added by Stephanie Glen on July 24, 2019 at 3:12pm — No Comments


© 2019 Data Science Central ®
