Stephanie Glen's Blog (101)

4 Common Data Analysis Mistakes to Watch Out For

  • Statistical analysis is widely misunderstood, resulting in an array of problems.
  • There are many ways to ruin your data analysis.
  • Four of the lesser-known mistakes to watch out for.

I recently came across an article in eLife (a peer-reviewed journal) called Ten common statistical mistakes to watch out for when writing or reviewing a manuscript [1]. Although the paper was geared towards reviewers of research papers, many of the…

Added by Stephanie Glen on January 19, 2021 at 8:16am — No Comments

Generative vs. Discriminative Models in One Picture

  • Discriminative and generative models have distinct differences.
  • Discriminative methods are simpler but not necessarily better.
  • This One Picture outlines a few major differences between the methods, along with a few examples and use cases.

Read any textbook on the difference between generative and discriminative models and…

Added by Stephanie Glen on January 11, 2021 at 11:39am — No Comments

Fraudulent Covid-19 Data and Benford's Law

An interesting application of Benford's and Zipf's Laws to fraudulent Covid-19 data was published…

Added by Stephanie Glen on December 31, 2020 at 4:33am — No Comments

Gross Underestimates and Overestimates from the Same Data: Covid-19 Death Rates Example

  • Ten-fold differences have been reported in Covid-19 death rates.
  • The discrepancies stem from how the rates are calculated.
  • Why the same data can yield gross overestimates and underestimates.

What is the actual death rate for Covid-19? After nearly a year of the pandemic, no one can…

Added by Stephanie Glen on December 24, 2020 at 6:00am — 1 Comment

Common Statistical Errors

  • Data analysis is fraught with pitfalls, including too-small sample sizes.
  • Bias can creep into the most well-intentioned of studies.
  • Tips to avoid bias and choose the best statistical test.

So you've formed your breakthrough hypothesis, created a bulletproof test procedure and waited eagerly for the results to come in. To your surprise, the magnificent effect you were sure of just isn't there. What went wrong? Was it the way your hypothesis was worded? An…

Added by Stephanie Glen on December 16, 2020 at 4:30am — No Comments

The Worst Covid-19 Misleading Graphs

  • Graphs are a great tool to appeal to a wide audience.
  • They are often used to deliberately mislead, not inform.
  • A few creative ways graphs have been used this year to distort Covid-19 facts.

Good graphs are powerful tools to convey data, but they can be skewed to fit an agenda. The worst graphs typically misuse visual proximity, manipulate data, and omit important details from chart titles and captions [1]. While it's fairly easy to spot a…

Added by Stephanie Glen on December 7, 2020 at 3:55pm — No Comments

Bar Chart vs Column Chart vs Histogram

  • Column charts, bar charts and histograms appeal to a wide audience.
  • Each chart has distinct differences.
  • Specific circumstances for when you should use each type.

In an article I posted last week, How to Choose the Right Graph for Your Data (In One Picture),…

Added by Stephanie Glen on November 30, 2020 at 11:30pm — No Comments

How to Choose the Right Graph for Your Data (In One Picture)

  • Choosing the right graph depends largely on your audience.
  • Dozens of graphs fit various data types.
  • While software can help with the design, you still have to choose a graph.
  • A flow chart to help with the decision.

Choosing the best graph for your data poses a challenge. Even if you get a program to do the hard work for you, you still have to make a choice between dozens of potential…

Added by Stephanie Glen on November 27, 2020 at 7:42am — No Comments

Periodic Function vs Non Periodic, Aperiodic, Quasiperiodic (In one picture)

  • An understanding of periodic functions is essential for any data scientist.
  • Periodic functions can be represented by Fourier series.
  • Some non-periodic functions can also be represented by Fourier series.
  • One picture to explain their similarities and dissimilarities.

Periodic functions, which repeat values at set intervals,…

Added by Stephanie Glen on November 19, 2020 at 3:07am — No Comments

Probability Mass Function vs Probability Density Function vs Cumulative Density Function (In One Picture)

  • This one picture shows how the CDF compares with the PDF and PMF.
  • All can be used to calculate probabilities.
  • Each function has a unique purpose.
  • The Cumulative Distribution Function (CDF) is the easiest to understand [1].
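
As a rough illustration of how a PMF and a CDF both yield probabilities, here is a minimal Python sketch. The fair six-sided die and the names `pmf`, `cdf` and `p_between` are assumptions chosen for this example; they do not come from the post or its picture.

```python
from fractions import Fraction
from itertools import accumulate

# Hypothetical example: a fair six-sided die (not taken from the post).
faces = [1, 2, 3, 4, 5, 6]

# PMF: the probability of each individual outcome.
pmf = {x: Fraction(1, 6) for x in faces}

# CDF: the running total of the PMF, i.e. P(X <= x).
cdf = dict(zip(faces, accumulate(pmf[x] for x in faces)))

# P(2 <= X <= 4), computed two ways: summing the PMF,
# or taking a difference of CDF values.
p_from_pmf = pmf[2] + pmf[3] + pmf[4]
p_between = cdf[4] - cdf[1]

print(p_from_pmf, p_between)
```

Both routes give the same answer for a discrete variable. For a continuous variable the sum over the PMF would become an integral of the PDF.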

References:…

Added by Stephanie Glen on November 10, 2020 at 4:54am — No Comments

Conjunction vs Disjunction: Bad Apples and Other Analogies

  • Conjunctions and disjunctions are useful tools for building algorithms. 
  • They enable you to combine propositions. 
  • Truth tables are a fast way to find solutions. 
  • Analogies can help you to remember the results. 
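
The truth tables mentioned above can be sketched in a few lines of Python. This is only an illustration: the function name `truth_table` and the variables `p` and `q` are hypothetical, and Python's `and` / `or` stand in for conjunction and disjunction.

```python
from itertools import product

def truth_table():
    """Return rows of (p, q, p AND q, p OR q) for every combination of p and q."""
    rows = []
    for p, q in product([True, False], repeat=2):
        rows.append((p, q, p and q, p or q))
    return rows

# Print the table with a simple header.
print("p      q      p AND q  p OR q")
for p, q, conj, disj in truth_table():
    print(f"{p!s:<6} {q!s:<6} {conj!s:<8} {disj!s}")
```

The conjunction is true in only one of the four rows (both propositions true), while the disjunction is false in only one (both propositions false).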

Dive into machine learning, and you'll come across…

Added by Stephanie Glen on October 29, 2020 at 6:17am — No Comments

Odds vs Probability vs Chance

Data Points

  • There are a number of different terms used for probability in statistics.
  • Each has a distinct (and usually precise)…

Added by Stephanie Glen on October 24, 2020 at 4:30am — No Comments

Types of Variables in Data Science in One Picture

While there are several dozen different types of possible variables, all can be categorized into a few basic areas. This simple graphic shows you how they are related, with a few examples of each type. 

More info:…

Added by Stephanie Glen on October 17, 2020 at 4:00pm — No Comments

5 Rules of Probability in One Picture (Cat and Dog Edition)

Knowledge of the basic rules of probability is a must-have for any data scientist. But if you're a visual learner like me, learning the algebraic representations of the 5 basic rules of probability (e.g. P(A) + P(A′) = 1) is a challenge. I've never been very good at memorizing formulas, but images stick in my head like earworms. Whenever I come across a new formula, I try to make it visual with a picture or doodle. The following picture shows some of the images I created to…

Added by Stephanie Glen on October 10, 2020 at 8:33am — No Comments

Why You Need to Know Those Probability Distributions

If you're in the beginning stages of your data science credential journey, you're about to take (or have already taken) a probability class. As part of that class, you're introduced to several different probability distributions, like the binomial distribution,…

Added by Stephanie Glen on September 30, 2020 at 3:00pm — No Comments

Correlation Coefficients in Data Science and Machine Learning (in One Picture)

In my first post on correlation coefficients, I outlined the differences between five popular coefficients: Pearson's,…

Added by Stephanie Glen on September 25, 2020 at 10:00am — No Comments

Stumped by Bayes' Theorem? Try This Simple Workaround

Bayes' Theorem formula.

Bayes' Theorem, which The Stanford Encyclopedia of Philosophy calls "...a simple mathematical formula," can be surprisingly difficult to actually solve. If you struggle with Bayesian logic, solving the "simple"…

Added by Stephanie Glen on September 15, 2020 at 11:29am — No Comments

MicroMasters: The Fast Way to Get Into Data Science

If the prospect of earning a master's degree in data science sounds too daunting (and expensive), then a MicroMasters might be a good fit for you. A MicroMasters is a "mini" master's degree, typically composed of four courses. The courses are offered at around a tenth of the cost of a typical master's program, so they are a great way to get your feet wet and see if data science is right for you.

I'm actually enrolled in the MIT MicroMasters program…

Added by Stephanie Glen on September 9, 2020 at 8:30am — 1 Comment

Model Fitting Tests You've Probably Never Heard Of (In One Picture)

When choosing a statistical test, you generally want to go for one of the more well-known ones, like the chi-square goodness of fit test. That's because more people are going to be able to understand your results, and you have the backing of a slew of…

Added by Stephanie Glen on August 31, 2020 at 6:30am — No Comments

Real Life Applications of Logarithms in Data Science and Beyond

Ah, the logarithm. It's the black sheep of the mathematics family, loved by a few slide-rule wielding, grey-haired professors and math Olympiad competitors. For the rest of us, logarithms remain on the "I'll get back to understanding that later when I see the point" shelf. However,…

Added by Stephanie Glen on August 27, 2020 at 11:50am — No Comments
