“We tried to do XYZ. Did it make a difference?”
Whether you are in the for-profit world or the not-for-profit world, this is a basic question that many people try to answer.
You could be working at a bank trying to figure out which offer is most appealing to customers; at an online retailer testing which ad display gets the most clicks; at the Department of Education testing the effect of smaller class sizes; at a city government office checking whether the new bike lane programs really make roads safer; at an online media provider (Netflix, YouTube, Spotify, …) searching for the best recommendation algorithm; or at a pharmaceutical company running a clinical trial comparing a drug’s effectiveness against a competitor’s, or studying whether a newly released drug has a strong impact in the real world…
All of these examples share the same underlying question: did an intervention cause a change in an outcome of interest?
Economists and Evaluation Specialists (those with degrees in Monitoring and Evaluation) study many techniques for program evaluation, including Randomized Experiments as well as Quasi-Experimental Methods such as Propensity Score Methods, Instrumental Variables, Interrupted Time Series, Regression Discontinuity, and Heckman’s Two-Stage Model…
Most data scientists can do A/B Testing like there is no tomorrow; it is a standard part of the Data Scientist’s toolkit. When successful, A/B testing creates a random assignment so that the two groups, A and B, are, on average, very similar in all observable and unobservable characteristics. The program evaluation then simply consists of checking the quality of the randomization (yes, this step gets skipped by many people, but it should not be) and then comparing the outcomes in Group A to those in Group B. This mirrors the way a clinical trial for a drug is designed and implemented.
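The two steps above — check the randomization, then compare outcomes — can be sketched on simulated data. Everything here (the covariate, the conversion rates, the sample sizes) is an illustrative assumption, not from any real experiment; the outcome comparison uses a standard two-proportion z-test.

```python
import math
import random

random.seed(42)

# Simulate an experiment: each user has one observable covariate (age)
# and is randomly assigned to group A or B. Group B gets a treatment
# that lifts conversion from 10% to 13% (assumed numbers).
users = [{"age": random.gauss(35, 10), "group": random.choice("AB")}
         for _ in range(2000)]
for u in users:
    base = 0.13 if u["group"] == "B" else 0.10
    u["converted"] = random.random() < base

a = [u for u in users if u["group"] == "A"]
b = [u for u in users if u["group"] == "B"]

# Step 1: randomization check -- observable covariates should balance.
mean_age_a = sum(u["age"] for u in a) / len(a)
mean_age_b = sum(u["age"] for u in b) / len(b)

# Step 2: compare outcomes with a two-proportion z-test.
p_a = sum(u["converted"] for u in a) / len(a)
p_b = sum(u["converted"] for u in b) / len(b)
p_pool = (sum(u["converted"] for u in a)
          + sum(u["converted"] for u in b)) / len(users)
se = math.sqrt(p_pool * (1 - p_pool) * (1 / len(a) + 1 / len(b)))
z = (p_b - p_a) / se
# Two-sided p-value from the normal CDF.
p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

print(f"age balance: A={mean_age_a:.1f}, B={mean_age_b:.1f}")
print(f"conversion: A={p_a:.3f}, B={p_b:.3f}, z={z:.2f}, p={p_value:.4f}")
```

In a real analysis you would check balance on every covariate you can observe, and use a library routine (e.g. from statsmodels or scipy) rather than hand-rolling the test.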
But what if the randomization failed? What if the groups are different? What if other experiments were going on at the same time that impacted the assignment?
What if randomization is not possible?
In these situations, the toolbox of Program Evaluation becomes critical to determining whether the program made a difference in the outcome of interest, whether that be higher click-through rates, increased sales, safer roads, more effective drugs, or better education.
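To make that concrete, here is a minimal sketch of one of the simplest ideas in that toolbox: subclassification on an observed confounder, a basic cousin of propensity score methods. The scenario and all numbers are illustrative assumptions — a "broken randomization" where heavy users are both more likely to be treated and more likely to convert on their own, so the naive A-versus-B comparison is biased.

```python
import random

random.seed(7)

# Simulate selection bias: heavy users are more likely to receive the
# treatment AND convert at a higher base rate regardless of it.
records = []
for _ in range(5000):
    heavy = random.random() < 0.4
    treated = random.random() < (0.7 if heavy else 0.3)   # selection bias
    base = 0.30 if heavy else 0.10                        # confounding
    converted = random.random() < base + (0.05 if treated else 0.0)
    records.append((heavy, treated, converted))           # true effect: +5 pts

def rate(rows):
    return sum(r[2] for r in rows) / len(rows)

treated_rows = [r for r in records if r[1]]
control_rows = [r for r in records if not r[1]]

# Naive comparison is biased upward: it mixes the true treatment effect
# with the over-representation of heavy users in the treated group.
naive_effect = rate(treated_rows) - rate(control_rows)

# Stratify: compare treated vs. control WITHIN each stratum of the
# confounder, then weight each stratum by its share of the population.
adjusted = 0.0
for stratum in (True, False):
    t = [r for r in records if r[1] and r[0] == stratum]
    c = [r for r in records if not r[1] and r[0] == stratum]
    weight = sum(1 for r in records if r[0] == stratum) / len(records)
    adjusted += weight * (rate(t) - rate(c))

print(f"naive estimate:    {naive_effect:.3f}")
print(f"adjusted estimate: {adjusted:.3f}  (true simulated effect: 0.050)")
```

With many confounders you cannot stratify exactly like this — which is precisely where propensity scores, instrumental variables, and the other quasi-experimental methods mentioned above come in.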
The desired skills for a Data Scientist already form quite a long list. Knowing that we can’t add an infinite number of required skills to the Data Scientist Toolbox, what do you think about a basic course in Program Evaluation? Would some training in Program Evaluation help round out a Data Scientist’s training?