<p>Comments - Understanding Type I and Type II Errors - Data Science Central</p>
<p>Comment by Bill Schmarzo, 2018-12-05:</p>
<p>Thanks Larry for the added perspective.</p>
<p>Comment by Larry Berk, 2018-08-24:</p>
<p> I routinely practice statistical testing, and yet I admittedly still find the terms 'positive' and 'negative' confusing. The aim of a statistical test is to find a meaningful difference between statistics (means, variances, etc.). In that context (as opposed to Bill's business examples), the NULL hypothesis is always that, in probabilistic terms, there is no difference.</p>
<p> Consequently, the outcome I hope for from the statistical test is to *REJECT* the NULL hypothesis. This is the affirming, 'positive' outcome (for me, the analyst). Likewise, the disappointing, 'negative' outcome is that, probabilistically, there is no meaningful difference in the statistics (and I have to rethink what I had hoped to demonstrate).</p>
<p> Therefore, I find it helpful to mentally swap the word 'positive' for the entire phrase 'REJECT the NULL hypothesis'. This reinforces in my head that:</p>
<p> 1) the word 'positive' is not a substitute for the word 'true'; a 'positive finding' means the test concludes that the statistics (groups, entities, etc.) being compared are different in probabilistic terms.</p>
<p> 2) the word 'negative' is not a substitute for the word 'false'; a 'negative finding' means the test concludes that the statistics (groups, entities, etc.) being compared are no different, also in probabilistic terms.</p>
<p> Now I can keep the phrases straight:</p>
<p> 'false positive' -- incorrectly concluding a difference when none really exists (a Type I error),</p>
<p> 'false negative' -- incorrectly concluding no difference when a significant one really does exist (a Type II error).</p>
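<p> To make the two definitions concrete, here is a minimal simulation sketch. It is only an illustration: the two-sample z-test with a known standard deviation, the sample size, the effect size of 0.5, and the function names <code>z_test_rejects</code> and <code>rejection_rate</code> are all assumptions chosen for the example, not something from the original post.</p>

```python
import random
from statistics import NormalDist, mean


def z_test_rejects(a, b, sigma=1.0, alpha=0.05):
    """Two-sided two-sample z-test with known sigma.

    Returns True when the test says REJECT the NULL hypothesis,
    i.e. a 'positive' finding in the sense used above.
    """
    standard_error = sigma * (1 / len(a) + 1 / len(b)) ** 0.5
    z = (mean(a) - mean(b)) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_value < alpha


def rejection_rate(true_diff, n=30, trials=2000, seed=0):
    """Fraction of simulated experiments in which the NULL is rejected."""
    rng = random.Random(seed)
    rejects = 0
    for _ in range(trials):
        a = [rng.gauss(0.0, 1.0) for _ in range(n)]
        b = [rng.gauss(true_diff, 1.0) for _ in range(n)]
        rejects += z_test_rejects(a, b)
    return rejects / trials


# No real difference exists, so every rejection is a false positive.
false_positive_rate = rejection_rate(true_diff=0.0)

# A real difference exists, so every non-rejection is a false negative.
false_negative_rate = 1 - rejection_rate(true_diff=0.5)

print(f"false positive rate: {false_positive_rate:.3f}")
print(f"false negative rate: {false_negative_rate:.3f}")
```

<p> Under these assumptions the false positive rate should land near the chosen alpha of 0.05, while the false negative rate depends on the sample size and on how large the real difference is.</p>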
<p> It's easy to get confused because the 'confusion matrix' (no pun intended) used for evaluating classification algorithm performance also uses the terms 'true positive', 'false negative', etc. In this context 'positive' *is* effectively a substitute for the word 'true' and 'negative' *is* effectively a substitute for the word 'false'.</p>
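<p> As a rough illustration of the classification case, the sketch below counts the four confusion-matrix cells for a binary classifier. The helper name <code>confusion_counts</code> and the sample labels are made up for the example. Here 'positive' and 'negative' describe what the classifier predicted, and 'true' and 'false' describe whether that prediction matched the actual label.</p>

```python
from collections import Counter


def confusion_counts(actual, predicted):
    """Count true/false positives and negatives for boolean labels."""
    counts = Counter()
    for a, p in zip(actual, predicted):
        verdict = "true" if a == p else "false"   # was the prediction right?
        label = "positive" if p else "negative"   # what was predicted?
        counts[f"{verdict} {label}"] += 1
    return counts


# Hypothetical labels, for illustration only.
actual = [True, True, False, False, True, False]
predicted = [True, False, False, True, True, False]

counts = confusion_counts(actual, predicted)
for cell in ("true positive", "false positive", "true negative", "false negative"):
    print(f"{cell}: {counts[cell]}")
```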
<p> Larry Berk,</p>
<p> Data Scientist, Hitachi Vantara</p>