How to find out if it's correlation or causation

This article was written by Joseph Rickert.

We all “know” that correlation does not imply causation, and that unmeasured and unknown factors can confound a seemingly obvious inference. But who has not been tempted by the seductive quality of strong correlations?

Fortunately, it is also well known that a well-done randomized experiment can account for the unknown confounders and permit valid causal inferences. But what can you do when it is impractical, impossible or unethical to conduct a randomized experiment? (For example, we wouldn’t want to ask a randomly assigned cohort of people to go through life with less education to prove that education matters.) One way of coping with confounders when randomization is infeasible is to introduce what economists call instrumental variables. Roughly speaking, an instrument is a variable that influences the treatment, affects the outcome only through the treatment, and is unrelated to the unmeasured confounders. This is a devilishly clever and apparently fragile notion that takes some effort to wrap one’s head around.
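To make the idea a little more concrete, here is a minimal simulation sketch in Python. It is not taken from Hyunseung’s talk; the variable names, the coefficients and the linear model are all assumptions chosen for illustration. An unmeasured confounder `u` drives both the treatment `x` and the outcome `y`, while an instrument `z` nudges `x` but has no direct path to `y`. With a single instrument, the instrumental variable estimate reduces to the ratio cov(z, y) / cov(z, x).

```python
import numpy as np


def simulate(n=100_000, true_effect=2.0, seed=0):
    """Simulate confounded data (all coefficients are illustrative assumptions)."""
    rng = np.random.default_rng(seed)
    u = rng.normal(size=n)  # unmeasured confounder
    z = rng.normal(size=n)  # instrument: independent of u, no direct effect on y
    x = 1.0 * z + 1.0 * u + rng.normal(size=n)          # treatment
    y = true_effect * x + 3.0 * u + rng.normal(size=n)  # outcome
    return z, x, y


def ols_slope(x, y):
    """Naive regression slope of y on x, which ignores the confounder."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))


def iv_slope(z, x, y):
    """Single-instrument IV (ratio) estimate: cov(z, y) / cov(z, x)."""
    zc = z - z.mean()
    return float(zc @ (y - y.mean()) / (zc @ (x - x.mean())))


TRUE_EFFECT = 2.0
z, x, y = simulate(true_effect=TRUE_EFFECT)
naive = ols_slope(x, y)
iv = iv_slope(z, x, y)
print(f"true effect: {TRUE_EFFECT:.2f}")
print(f"naive slope: {naive:.2f}")
print(f"IV estimate: {iv:.2f}")
```

Under these assumed coefficients, the naive slope is biased upward because `u` pushes `x` and `y` in the same direction, while the IV estimate should land close to the true effect. The sketch also hints at the fragility mentioned above: if `z` were correlated with `u`, or affected `y` directly, the ratio would no longer recover the true effect.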

On Tuesday, October 20th, we at the Bay Area useR Group (BARUG) had the good fortune to have Hyunseung Kang describe the work that he and his colleagues at the Wharton School have been doing to extend the usefulness of instrumental variables. Hyunseung’s talk started with elementary notions, such as why randomized experiments are effective, then described the essential idea of instrumental variables and developed the background necessary for understanding the new results in this area. The slides from Hyunseung’s talk are available for download in two parts from the BARUG website. As with most presentations, these slides are little more than the mute residue of the talk itself. Nevertheless, Hyunseung makes such imaginative use of animation and build slides that the deck is worth working through.

The following slide from Hyunseung’s presentation captures the essence of the instrumental approach.

To read more about Instrumental Variables, click here.
