The concepts of p-value and level of significance are vital components of hypothesis testing and of advanced methods like regression. However, they can be a little tricky to understand, especially for beginners, and a good grasp of them goes a long way in understanding advanced topics in statistics and econometrics. Here, we try to explain the concepts in a simple, logical manner. Hope this helps.
P-value
In hypothesis testing, we set a null hypothesis (let's say the population mean is 10), and then test this hypothesis using a sample. After performing the test, we get a result (let's say a sample mean of 12). The p-value then answers the question: given that the population mean really is 10, what is the probability of obtaining a sample mean of 12 (or one even further from 10)?
If that probability is too low, we reject the null hypothesis; that is, we say that based on the current evidence and testing, the data are inconsistent with the null hypothesis. If that probability is high, we fail to reject the null hypothesis; that is, we say that the evidence is not strong enough to rule it out. This probability is the p-value. It is a result that we obtain after conducting our statistical test (e.g., a regression).
To explain further, it is important to understand what we are trying to do. We have a population. We assume something about that population (let's say that its mean is 10), and now we want to test, from a given sample, whether it is true that the mean is 10. How do we do that? We perform our statistical test on the sample (and NOT the population) and get a result. Let's say the result is a sample mean of 12.
Now, it is important to distinguish what we have assumed from what we have obtained. We assumed that the population mean is 10, and we obtained a sample mean of 12. The population mean is an assumption, a possibility: it is what we posit the value to be. The sample mean is a result, computed from our data after performing the test.
Now we have to check whether what we have obtained (the sample mean) is consistent with what we have assumed (the population mean). In other words, what are the chances of getting this result if the assumption is actually true? What are the chances that the sample mean is 12, under the assumption that the population mean is 10? That probability is called the p-value.
If the p-value is low, it means the chances of obtaining a sample mean of 12 were very low if the population mean really were 10. Thus, something is wrong. The sample mean cannot be wrong, as it is our result; it is what our sample data say. The only thing that can be wrong is the assumption about the population mean. In other words, it appears that the assumption that the population mean is 10 (our null hypothesis) is itself implausible, and we should reject it. In this case, we say that our result is SIGNIFICANT, meaning that our sample mean is significantly different from the hypothesized population mean.
If the p-value is high, it means the chances of obtaining a sample mean of 12 were quite high if the population mean really were 10. Thus, the assumption that the population mean is 10 (our null hypothesis) is consistent with the data, and we fail to reject it. In this case, we say that our result is INSIGNIFICANT, meaning that our sample mean is NOT significantly different from the hypothesized population mean.
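The logic above can be sketched in code with a one-sample t-test, a standard way to test a hypothesis about a mean. The sample values below are made up purely for illustration; only the hypothesized mean of 10 comes from the text.

```python
import numpy as np
from scipy import stats

# Hypothetical sample data (assumed values, chosen so the sample mean is ~12)
sample = np.array([11.8, 12.3, 11.5, 12.9, 12.1, 11.7, 12.4, 12.6, 11.9, 12.2])

# Null hypothesis: the population mean is 10
t_stat, p_value = stats.ttest_1samp(sample, popmean=10)

print(f"sample mean = {sample.mean():.2f}")
print(f"t = {t_stat:.2f}, p-value = {p_value:.6f}")

# A tiny p-value: a sample mean this far from 10 would be very unlikely
# if the population mean really were 10, so we reject the null hypothesis.
if p_value < 0.05:
    print("Reject H0: sample mean is significantly different from 10")
else:
    print("Fail to reject H0")
```

Here the sample mean is far from 10 relative to the spread of the data, so the p-value comes out very small and the result is significant.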
Level of significance
Now, the next question is: how do we know whether the p-value we have obtained after our statistical test is low enough to reject the null hypothesis? Is 0.03 (3%) too low or too high? Is 0.07 (7%) too low or too high?
To decide whether the p-value is too low or too high, we have to set a standard (a checkpoint or benchmark). If the obtained p-value is less than that standard, we conclude that the p-value is low, our results are significant, and we should reject the null hypothesis. If the obtained p-value is higher than that standard, we conclude that the p-value is high, our results are insignificant, and we fail to reject the null hypothesis.
This standard or checkpoint is called the LEVEL OF SIGNIFICANCE. It is up to us as statistical investigators to choose the level of significance. Most often, a level of significance of 5% is chosen as standard practice, although levels like 1% and 10% are also used.
E.g., if our p-value is 0.07, our results are insignificant at the 5% level (we fail to reject the null hypothesis at this level) but significant at the 10% level (we reject the null hypothesis at this level).
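The decision rule is just a comparison of the p-value against the chosen level. A minimal sketch of the p = 0.07 example from the text, checked at the three commonly used levels:

```python
p_value = 0.07  # the p-value from the text's example

# Compare against each commonly used level of significance (alpha)
for alpha in (0.01, 0.05, 0.10):
    if p_value < alpha:
        decision = "significant -> reject H0"
    else:
        decision = "insignificant -> fail to reject H0"
    print(f"alpha = {alpha:.2f}: {decision}")
```

This prints that p = 0.07 is insignificant at the 1% and 5% levels but significant at the 10% level, matching the example above.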
Comment
If that probability is too low, we reject the null hypothesis, that is, we say that based on current evidence and testing, the null hypothesis is not true.
Perhaps it's just a matter of semantics but I think it's important to note that a hypothesis test cannot prove or disprove a null hypothesis. You can only reject or fail to reject it. If you reject the null, all you are saying is that your sample data suggests strong evidence in favour of your alternative hypothesis and that the sample mean is statistically different from the null's hypothesized true mean.
When you set a significance level of, say, 0.05, you are setting the probability of committing a type I error, which is rejecting the null when in fact it was true. Again, we don't know if the null is true or not, but if it were true and we rejected it, that would be a type I error. At 0.05, if we were to take many, many samples, we'd expect about 5% of them to have 95% confidence intervals that do not capture the true mean.
So, our doubt here is that our single sample might be one of the 5% that doesn't capture the true mean and made us reject the null when in fact we shouldn't have. This is why you can't prove or disprove the null.
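The commenter's "about 5% of intervals miss the true mean" claim can be checked directly by simulation. The population parameters below (mean 10, standard deviation 2, sample size 50) are assumptions chosen purely for illustration:

```python
import numpy as np

# Simulation sketch: draw many samples from a population whose true mean we
# know, build a 95% confidence interval from each sample, and count how often
# the interval misses the true mean.
rng = np.random.default_rng(0)
true_mean, sigma, n, n_trials = 10.0, 2.0, 50, 10_000

samples = rng.normal(true_mean, sigma, size=(n_trials, n))
means = samples.mean(axis=1)
ses = samples.std(axis=1, ddof=1) / np.sqrt(n)  # standard error of each mean

# 1.96 is the normal-approximation critical value for 95% confidence
missed = (true_mean < means - 1.96 * ses) | (true_mean > means + 1.96 * ses)
print(f"fraction of 95% CIs missing the true mean: {missed.mean():.3f}")
```

The miss rate comes out close to 0.05 (slightly above, since the normal critical value 1.96 is a mild approximation to the t critical value at n = 50), illustrating the type I error rate the commenter describes.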
I recently attended the ASA’s Symposium on Statistical Inference with Peter Bruce, the founder of Statistics.com. In a chat with the co-chair, Peter asked, partly tongue-in-cheek, whether the real problem was too much research chasing too few real results. Scientific American Online’s opinion editor liked the topic, and the result was the following article: https://blogs.scientificamerican.com/observations/are-scientists-do...
© 2020 Data Science Central ®