Whenever we make a decision in business, we test a hypothesis. Whether it is in product, marketing or sales, in the end we make assumptions that guide our actions. When we say that we will implement the next feature or run this campaign, we hypothesize that this particular action will have a positive impact on whatever we have set as a goal. The goal could be our revenue, our signups, the time it takes for a customer to use the product, or anything else you care about.
You might wonder why I’m making these obvious statements. There’s a lot of buzz about statistics, data science, machine learning and so on. What these fields actually do is codify, in a scientific way, processes that everyone inside a business already executes. This new, more scientific way of doing things has some serious advantages, and that is what justifies the hype. In the end, though, it is just a more formal way of doing the things you already do. Hopefully, by exposing the connection between established practices and this new scientific way, it will also become more accessible and less frightening to anyone without a technical or scientific background.
Let’s start with a simple use case. Based on our metrics, we believe that when a new user signs up for our product, it takes them a long time to figure out what to do with it. This is easy to measure with a tool like Mixpanel, Segment or Intercom. We capture the events that the user generates in the product and measure the time between two consecutive events. In this case, the first event would be the login event and the second could be any event or action in the product.
To reduce the time it takes for the user to start using the product, we decide to introduce an intro video. When she logs in for the first time, the video explains what the product has to offer. Hopefully, by watching this video the user will be better informed about the capabilities of the product, and the time it takes to start using it will go down.
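To make the measurement concrete, here is a minimal sketch of how the time between the login event and the first following action could be computed from a raw event log. The event names, user ids and timestamps are made up for illustration, and the tuple format is an assumption; real exports from an analytics tool will look different.

```python
from datetime import datetime

# Hypothetical event log: (user_id, event_name, ISO timestamp).
events = [
    ("u1", "login", "2024-01-01T10:00:00"),
    ("u1", "create_project", "2024-01-01T10:04:30"),
    ("u2", "login", "2024-01-01T11:00:00"),
    ("u2", "invite_teammate", "2024-01-01T11:12:00"),
    ("u1", "invite_teammate", "2024-01-01T10:20:00"),
    ("u3", "login", "2024-01-01T12:00:00"),  # never took an action
]


def time_to_first_action(events):
    """Return {user_id: seconds between first login and the next event}."""
    # Group events per user, parsing timestamps so they can be sorted.
    by_user = {}
    for user, name, ts in events:
        by_user.setdefault(user, []).append((datetime.fromisoformat(ts), name))

    result = {}
    for user, user_events in by_user.items():
        user_events.sort()  # chronological order
        login_time = next((t for t, n in user_events if n == "login"), None)
        if login_time is None:
            continue  # no login recorded for this user
        following = [t for t, n in user_events if t > login_time and n != "login"]
        if following:
            result[user] = (following[0] - login_time).total_seconds()
    return result


durations = time_to_first_action(events)
```

Users who logged in but never did anything else (like `u3` here) are left out of the result; whether to exclude them or count them separately is a product decision, not something the code can settle.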
So what a product manager would do is the following:
We are comparing two funnels. They can be built from two different groups of people over the same period: the new signups who saw the video and those who didn’t. Alternatively, the product manager could compare against the same funnel for the signups that happened before the video was introduced.
So far the product manager has done the following:
There’s a very basic concept in statistics that is called “hypothesis testing”.
“Hypothesis testing is the use of statistics to determine the probability that a given hypothesis is true.”
Statistical hypothesis testing also has four steps:
Now, the above might scare you a bit, but in the end it’s the formal way statistics offers to carry out the same steps that the product manager followed. Let’s see how they relate.
1. The Null Hypothesis is that the video makes no difference: the mean time between the two funnel steps is the same whether or not the video is played to the new signup. The Alternative Hypothesis is that the mean time after the video is introduced will be smaller. This is exactly what the product manager is also assessing by comparing the two funnels we described earlier.
2. Instead of creating the funnels, a statistician would choose a test statistic based on the nature of the events being measured, and use it to calculate the p-value needed in step 3. Here we are getting a bit technical, but it’s really just a different tool that takes the place of the funnels.
3. Instead of comparing the mean times directly and making a more qualitative assessment, we compare the p-value against a predefined threshold (the significance level) to decide whether the hypothesis is likely to hold or not. The p-value is the probability of seeing a difference at least as large as the observed one if the video actually made no difference.
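The steps above can be sketched in a few lines of code. The example below uses a permutation test on the difference of mean times, which is only one possible choice of test and not one prescribed here; a statistician might well pick a different one. All numbers are invented, and the function name and the 0.05 threshold are assumptions made for illustration.

```python
import random
from statistics import mean

# Hypothetical times (in seconds) from login to first action.
no_video = [310, 295, 420, 380, 265, 350, 400, 330, 290, 370]
with_video = [240, 210, 300, 280, 190, 260, 310, 230, 220, 270]


def permutation_p_value(control, treatment, n_permutations=10_000, seed=0):
    """One-sided p-value for 'treatment mean is smaller than control mean'."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = mean(control) - mean(treatment)  # the test statistic
    pooled = list(control) + list(treatment)
    n_control = len(control)

    at_least_as_extreme = 0
    for _ in range(n_permutations):
        # Under the null hypothesis the group labels are interchangeable,
        # so reshuffle them and recompute the statistic.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_control]) - mean(pooled[n_control:])
        if diff >= observed:
            at_least_as_extreme += 1

    # Add-one correction so the estimate is never exactly zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


alpha = 0.05  # significance threshold, chosen before looking at the data
p_value = permutation_p_value(no_video, with_video)
reject_null = p_value < alpha
```

If `reject_null` is true, a difference this large would be unusual if the video had no effect, which is the quantified version of the product manager saying the funnels look clearly different.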
Now, we need to pay close attention to the word “likely”. When the product manager uses the funnels, she is actually making a qualitative assessment of the validity of the hypothesis. Nothing is certain, but if, for example, a big difference between the mean times is observed, she will feel comfortable saying that introducing the video was the right choice. If the difference is marginal, she will go back to the whiteboard with the team to figure out new ways of fixing the problem. When we use statistics we still use the word “likely”, because there is no absolute certainty in the results either. The major difference between what the product manager was doing and these statistical techniques is that this certainty (or lack of it) is quantified and produced by a fully controlled process.
Yes, quite a few new words were introduced, but hopefully I managed to show you the affinity between statistical techniques and the everyday work of someone who’s not a statistician. Of course, the product manager will not do the work of a data scientist and start using Chi-Square and Student’s t-tests, or write down confidence intervals instead of product roadmaps. But understanding how the arsenal of a data scientist relates to our work will help us embrace it and feel more comfortable with what it has to offer, as well as with its limitations.