If you've spent any time with modeling data, you'll know that there are many pitfalls to be had when it comes to data presentation (I addressed some common pitfalls in Misleading Graphs Part 1). Misleading graphs can be the result of incorrect data collection, ignorance of the basic "rules" of data presentation (like labeling axes), or even deliberate attempts to mislead. A fourth pitfall is more nuanced: despite your best efforts, **you can dilute your message with how you present the data. **

These examples show how easy it is to muddle a simple message.

I wanted to know the answer to the question: *How much income do I need to be happy?*

I looked to one of my favorite studies of all time, *High income improves evaluation of life but not emotional well-being* by Nobel Prize-winning economist Angus Deaton and colleague Daniel Kahneman of Princeton University. The team investigated how much money is the "sweet spot" of happiness. Let's take a look at the graph that accompanied the article. The first thing I noticed was the "Ladder" (after all, it's the focal point in BOLD):

I thought perhaps the ladder (and the "Mean ladder on the right hand y-axis) were related to a *corporate* ladder (because we *are* discussing incomes here) but I was completely wrong. According to the authors, the Ladder is actually "...the average reported number on a scale of 0–10, marked on the right-hand scale". So **not only does it not look like a ladder at all, it doesn't sound like one either** (what ladder do you know that's marked on a scale of 0 to 10)? And while we're discussing that scale, note that the authors mention 0 to 10 but the scale on the graph goes from 5.5 to 6.5. Why cut the scale off? If you wanted to know where the ladder hits the ground (presumably at an income of a few dollars), you won't find it on this graph.

But before we ruminate for much longer over that ladder, let's not forget that we're here to find out how much money we need to be happy--not to decipher the ladder's purpose. That ladder was a big distraction to me, and it's likely to be a big distraction to a big chunk of your audience. The takeaway: **don't put ladders, buckets, horses, or other strong imagery in your graph** without fully taking into account what your reader will envisage.

But back to the question at hand. The "happiest income level" is buried on the above graph under fractions and means, plus a spaghetti junction of nine different lines (including two y-axes!). According to The Guardian, one fifth of adults have forgotten how to do fractions and even fewer remember how to calculate the mean. And with the average American reading at an eighth grade level, is the average person in your company going to be able to glean an answer from a graph like this one? Probably not.

If I were to glance at the graph and guess what the magic number might be, I might pick the intersection of "Ladder" and "Positive Effect": 60,000. And I would be wrong.

Let's look at another graph, produced with a similar question in mind.

A second study along the same lines was written by Andrew Jebb, Louis Tay, Ed Diener, and Shigehiro Oishi, titled “Happiness, Income Satiation and Turning Points around the World”.

Graph: Courtesy of Richard Carrier

(*Sidebar*: Perhaps oddly, the first thing I noticed about this graph is that, unlike the first graph in this article, the horizontal axis is labeled correctly (with a $ sign). This is a small, but important note: make sure your axes are labeled!)

I'm still looking for a simple answer to a simple question. **And I'm not getting it from this graph. **

It suffers from an abundance of acronyms*.* "AUS" is a clue that it's a world view, but that's as far as I got. I had to consult Google to figure out what EE, SE, AF and LA stand for. Eritrea, Singapore, Afghanistan and Louisiana were what sprung to mind. Yes, **I know Louisiana isn't a country** but, living very close to Louisiana (I'm in Florida), my brain saw LA and fired up the Louisiana neuron. In case you're as acronym-challenged as I, the correct answers are Estonia, Sweden, Afghanistan and Laos. I spent more time trying on the wild Google chase of figuring out what those stood for than I spent actually looking for the answer to my question. I'm sure the codes were explained in the study, but **they should have put them on the graph.**

This next graph, from Slate.com, requires no explanation.

**Past 100k, everyone's happy.** A simple graph, for a simple question--and not a wild goose chase in sight. It might not make it into an academic journal, but as a data scientist, that's probably not your goal.

This example illustrated some of the pitfalls to be had with presenting data. As a data scientist, you are probably going to be relaying information that's technical, industry-specific, or just plain dull. It's your job to take those uninteresting details and present them in an interesting,informative way that can be understood by everyone--quickly, and without having to reference other sources to understand your presentation.

© 2020 Data Science Central ® Powered by

Badges | Report an Issue | Privacy Policy | Terms of Service

**Upcoming DSC Webinar**

- DataOps: How Bell Canada Powers their Business with Data - July 15

Demand for data outstrips the capacity of IT organizations and data engineering teams to deliver. The enabling technologies exist today and data management practices are moving quickly toward a future of DataOps. DataOps is an automated, process-oriented methodology, used by analytic and data teams, to improve the quality and reduce the cycle time of data analytics. Register today.

**Most Popular Content on DSC**

To not miss this type of content in the future, subscribe to our newsletter.

- Book: Statistics -- New Foundations, Toolbox, and Machine Learning Recipes
- Book: Classification and Regression In a Weekend - With Python
- Book: Applied Stochastic Processes
- Long-range Correlations in Time Series: Modeling, Testing, Case Study
- How to Automatically Determine the Number of Clusters in your Data
- New Machine Learning Cheat Sheet | Old one
- Confidence Intervals Without Pain - With Resampling
- Advanced Machine Learning with Basic Excel
- New Perspectives on Statistical Distributions and Deep Learning
- Fascinating New Results in the Theory of Randomness
- Fast Combinatorial Feature Selection

**Other popular resources**

- Comprehensive Repository of Data Science and ML Resources
- Statistical Concepts Explained in Simple English
- Machine Learning Concepts Explained in One Picture
- 100 Data Science Interview Questions and Answers
- Cheat Sheets | Curated Articles | Search | Jobs | Courses
- Post a Blog | Forum Questions | Books | Salaries | News

**Archives:** 2008-2014 |
2015-2016 |
2017-2019 |
Book 1 |
Book 2 |
More

**Upcoming DSC Webinar**

- DataOps: How Bell Canada Powers their Business with Data - July 15

Demand for data outstrips the capacity of IT organizations and data engineering teams to deliver. The enabling technologies exist today and data management practices are moving quickly toward a future of DataOps. DataOps is an automated, process-oriented methodology, used by analytic and data teams, to improve the quality and reduce the cycle time of data analytics. Register today.

**Most popular articles**

- Free Book and Resources for DSC Members
- New Perspectives on Statistical Distributions and Deep Learning
- Time series, Growth Modeling and Data Science Wizardy
- Statistical Concepts Explained in Simple English
- Machine Learning Concepts Explained in One Picture
- Comprehensive Repository of Data Science and ML Resources
- Advanced Machine Learning with Basic Excel
- Difference between ML, Data Science, AI, Deep Learning, and Statistics
- Selected Business Analytics, Data Science and ML articles
- How to Automatically Determine the Number of Clusters in your Data
- Fascinating New Results in the Theory of Randomness
- Hire a Data Scientist | Search DSC | Find a Job
- Post a Blog | Forum Questions

## You need to be a member of Data Science Central to add comments!

Join Data Science Central