- Statistical analysis is poorly understood by many, resulting in an array of problems.
- There are many ways to ruin your data analysis.
- Four of the lesser-known mistakes to watch out for.

I recently came across an article in eLife (a peer-reviewed journal) called *Ten common statistical mistakes to watch out for when writing or reviewing a manuscript* [1]. Although the paper was geared towards reviewers of research papers, many of the…

Added by Stephanie Glen on January 19, 2021 at 8:16am — No Comments

- Discriminative and generative models have distinct differences.
- Discriminative methods are simpler but not necessarily better.
- This One Picture outlines a few major differences between the methods, along with a few examples and use cases.

Read any textbook on the difference between generative and discriminative models and…

Added by Stephanie Glen on January 11, 2021 at 11:39am — No Comments

An interesting application of Benford's and Zipf's Laws to fraudulent Covid-19 data was published…

Added by Stephanie Glen on December 31, 2020 at 4:33am — No Comments
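As background for the Benford's Law post above: Benford's Law says the leading digit d of many naturally occurring numbers appears with probability log10(1 + 1/d). The following is a minimal sketch of how observed first-digit frequencies could be compared with that expectation. The function names and the sample data (powers of 2) are illustrative assumptions, not taken from the published analysis.

```python
import math
from collections import Counter

def benford_expected():
    """Expected first-digit proportions under Benford's Law: log10(1 + 1/d)."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}

def first_digit_frequencies(values):
    """Observed proportion of each leading digit (1-9), ignoring zeros."""
    # Scientific notation puts the leading significant digit first.
    digits = [int(f"{abs(v):.15e}"[0]) for v in values if v != 0]
    if not digits:
        raise ValueError("need at least one nonzero value")
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in range(1, 10)}

# Hypothetical sample: powers of 2, a sequence known to follow Benford's Law.
sample = [2 ** k for k in range(1, 101)]
expected = benford_expected()
observed = first_digit_frequencies(sample)
for d in range(1, 10):
    print(f"{d}: expected {expected[d]:.3f}, observed {observed[d]:.3f}")
```

Large gaps between the two columns are a prompt for a closer look, not proof of fraud; a formal comparison would use a goodness-of-fit test.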

- Ten-fold differences have been reported in Covid-19 death rates.
- Estimates differ because of how the rates are calculated.
- Why the same data can yield gross overestimates *and* underestimates.

**What is the actual death rate for Covid-19?** After nearly a year of the pandemic, *no one can…*

Added by Stephanie Glen on December 24, 2020 at 6:00am — 1 Comment

- Data analysis is fraught with pitfalls, including too-small sample sizes.
- Bias can creep into the most well-intentioned of studies.
- Tips to avoid bias and choose the best statistical test.

So you've formed your breakthrough hypothesis, created a bulletproof test procedure and waited eagerly for the results to come in. To your surprise, the magnificent effect you were sure of just isn't there. What went wrong? Was it the way your hypothesis was worded? An…

Added by Stephanie Glen on December 16, 2020 at 4:30am — No Comments

- Graphs are a great tool to appeal to a wide audience.
- They are often used to deliberately mislead, not inform.
- A few creative ways graphs have been used this year to distort Covid-19 facts.

Good graphs are powerful tools to convey data, but they can be skewed to fit an agenda. The worst graphs typically misuse visual proximity, manipulate data, and omit important details from chart titles and captions [1]. While it's fairly easy to spot a…

Added by Stephanie Glen on December 7, 2020 at 3:55pm — No Comments

- Column charts, bar charts and histograms appeal to a wide audience.
- Each chart has distinct differences.
- Specific circumstances for when you should use each type.

In an article I posted last week, How to Choose the Right Graph for Your Data (In One Picture),…

Added by Stephanie Glen on November 30, 2020 at 11:30pm — No Comments

- Choosing the right graph depends largely on your audience.
- Dozens of graphs fit various data types.
- While software can help with the design, you still have to choose a graph.
- A flow chart to help with the decision.

Choosing the best graph for your data poses a challenge. Even if you get a program to do the hard work for you, you still have to make a choice between dozens of potential…

Added by Stephanie Glen on November 27, 2020 at 7:42am — No Comments

- An understanding of periodic functions is essential for any data scientist.
- Periodic functions can be represented by Fourier series.
- Some non-periodic functions can also be represented by Fourier series.
- One picture to explain their similarities and dissimilarities.

Periodic functions, which repeat values at set intervals,…

Added by Stephanie Glen on November 19, 2020 at 3:07am — No Comments

- This one picture shows how the CDF compares with the PDF and PMF.
- All can be used to calculate probabilities.
- Each function has a unique purpose.
- The Cumulative Distribution Function (CDF) is the easiest to understand [1].
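
To make the relationship in the list above concrete for the discrete case, here is a minimal sketch using a fair six-sided die (an illustrative assumption, not an example from the post). The PMF gives P(X = x), and the CDF is the running total F(x) = P(X ≤ x), so interval probabilities are differences of CDF values.

```python
from fractions import Fraction

# PMF of a fair six-sided die: each face has probability 1/6.
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

def cdf_at(x):
    """F(x) = P(X <= x): sum the PMF over all faces not exceeding x."""
    return sum((p for face, p in pmf.items() if face <= x), Fraction(0))

def prob_between(low, high):
    """P(low < X <= high) = F(high) - F(low)."""
    return cdf_at(high) - cdf_at(low)

print(cdf_at(3))           # P(X <= 3)
print(prob_between(2, 4))  # P(X is 3 or 4)
```

For a continuous variable the same idea holds with the sum replaced by an integral of the PDF.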

References:…

Added by Stephanie Glen on November 10, 2020 at 4:54am — No Comments

- Conjunctions and disjunctions are useful tools for building algorithms.
- They enable you to combine propositions.
- Truth tables are a fast way to find solutions.
- Analogies can help you to remember the results.
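
As a small illustration of the points above, the sketch below builds the truth table for the conjunction (p AND q) and disjunction (p OR q) of two propositions. The function name `truth_table` is a hypothetical helper chosen for this example.

```python
from itertools import product

def truth_table():
    """Return rows of (p, q, p AND q, p OR q) for every combination of p and q."""
    return [(p, q, p and q, p or q) for p, q in product([True, False], repeat=2)]

print(f"{'p':<6}{'q':<6}{'p AND q':<9}{'p OR q':<7}")
for p, q, conj, disj in truth_table():
    print(f"{p!s:<6}{q!s:<6}{conj!s:<9}{disj!s:<7}")
```

The conjunction is true in only one of the four rows (both inputs true), while the disjunction is false in only one (both inputs false).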

Dive into machine learning, and you'll come across…

Added by Stephanie Glen on October 29, 2020 at 6:17am — No Comments

**Data Points**

- There are a number of different terms used for probability in statistics.
- Each has a distinct (and usually precise)…

Added by Stephanie Glen on October 24, 2020 at 4:30am — No Comments

While there are several dozen different types of possible variables, all can be categorized into a few basic areas. This simple graphic shows you how they are related, with a few examples of each type.

More info:…

Added by Stephanie Glen on October 17, 2020 at 4:00pm — No Comments

Knowledge of the basic rules of probability is a must-have for any data scientist. But if you're a visual learner like me, learning the algebraic representations of the **5 basic rules of probability** (e.g. the complement rule, P(A) + P(A′) = 1) is a challenge. I've never been very good at memorizing formulas, but images stick in my head like earworms. Whenever I come across a new formula, I try to make it visual with a picture or doodle. The following picture shows some of the images I created to…

Added by Stephanie Glen on October 10, 2020 at 8:33am — No Comments

If you're in the beginning stages of your data science credential journey, you're either about to take a probability class or have already taken one. As part of that class, you're introduced to **several different probability distributions**, like the binomial distribution,…

Added by Stephanie Glen on September 30, 2020 at 3:00pm — No Comments

In my first post on correlation coefficients, I outlined the differences between five popular coefficients: Pearson's,…

Added by Stephanie Glen on September 25, 2020 at 10:00am — No Comments

Bayes' Theorem, which The Stanford Encyclopedia of Philosophy calls "...a simple mathematical formula," can be surprisingly difficult to actually apply. If you struggle with Bayesian logic, solving the "simple"…

Added by Stephanie Glen on September 15, 2020 at 11:29am — No Comments
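For readers of the Bayes' Theorem post above, the formula P(A|B) = P(B|A)·P(A) / P(B) can be sketched in a few lines. This is a minimal illustration, not the method from the post: the function name, parameter names, and the screening-test numbers are all hypothetical.

```python
def bayes_posterior(prior, likelihood, false_positive_rate):
    """P(A|B) via Bayes' Theorem.

    prior               = P(A)
    likelihood          = P(B|A)
    false_positive_rate = P(B|not A)
    """
    # Law of total probability: P(B) = P(B|A)P(A) + P(B|not A)P(not A)
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence

# Hypothetical screening test: 1% prevalence, 95% sensitivity,
# 5% false positive rate.
posterior = bayes_posterior(prior=0.01, likelihood=0.95, false_positive_rate=0.05)
print(f"P(condition | positive test) = {posterior:.3f}")
```

With these made-up inputs the posterior is only about 16%, because true positives from the small affected group are outnumbered by false positives from the much larger unaffected group.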

If the prospect of earning a master's degree in data science sounds too daunting (and expensive), then a **MicroMasters** might be a good fit for you. A MicroMasters is a "mini" master's degree, typically composed of four courses. The courses are offered at a fraction of the cost of a typical master's program (around a tenth of the cost), so they are a great way to get your feet wet and see if data science is right for you.

I'm actually enrolled in the MIT MicroMasters program…

Added by Stephanie Glen on September 9, 2020 at 8:30am — 1 Comment

When choosing a statistical test, you generally want to go for one of the more well-known ones, like the chi-square goodness of fit test. That's because more people are going to be able to understand your results, and you have the backing of a slew of…

Added by Stephanie Glen on August 31, 2020 at 6:30am — No Comments

Ah, the logarithm. It's the black sheep of the mathematics family, loved by a few slide-rule wielding, grey-haired professors and math Olympiad competitors. For the rest of us, logarithms remain on the "I'll get back to understanding that later when I see the point" shelf. However,…

Added by Stephanie Glen on August 27, 2020 at 11:50am — No Comments

© 2021 TechTarget, Inc.