Determining sample sizes is a challenging undertaking. For simplicity, I've limited this picture to one of the most common testing situations: testing for differences in means. Some assumptions have been made (for example, normality and equal sample sizes). Additionally, the formulas shown are for one-tailed tests. Usually, a small tweak (e.g. replacing Zα with Zα/2) is all that's required for two-tailed tests.
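As a rough illustration of the one-tailed versus two-tailed tweak, here is a minimal Python sketch. It assumes the common textbook setting of two independent groups with equal sample sizes, normal data, and a known, shared standard deviation, where the per-group size is n = 2σ²(Zα + Zβ)²/δ². The picture itself isn't reproduced here, so treat this as one standard formula of that kind rather than a transcription of it. The function name `n_per_group` and its parameters are made up for this example.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta, sigma, alpha=0.05, power=0.80, two_tailed=False):
    """Per-group sample size for detecting a difference in two means.

    Assumptions (illustrative only): normal data, equal group sizes,
    and a known standard deviation `sigma` common to both groups.
    `delta` is the smallest difference in means worth detecting.
    """
    if delta <= 0 or sigma <= 0:
        raise ValueError("delta and sigma must be positive")
    z = NormalDist().inv_cdf  # standard normal quantile function
    # One-tailed tests use Z_alpha; two-tailed tests swap in Z_{alpha/2}.
    z_alpha = z(1 - (alpha / 2 if two_tailed else alpha))
    z_beta = z(power)  # power = 1 - beta
    n = 2 * (sigma * (z_alpha + z_beta) / delta) ** 2
    return ceil(n)  # round up to a whole subject


if __name__ == "__main__":
    print(n_per_group(delta=5, sigma=10))                   # one-tailed: 50
    print(n_per_group(delta=5, sigma=10, two_tailed=True))  # two-tailed: 63
```

With a difference of 5, a standard deviation of 10, α = 0.05 and 80% power, the sketch gives 50 per group for a one-tailed test and 63 per group for a two-tailed test, which shows how much the Zα/2 substitution can matter.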

The picture shows the traditional (frequentist) route for determining sample size. Another possible route is to use Bayesian methods. Although Bayesian methods are becoming more popular, none of the major software programs include them and, unlike the frequentist route, no standard Bayesian methods exist for determining sample size. If you're interested in Bayesian methods, refer to Chapter 13 in Chow et al. (2008).

## References

Chow et al. (2017). Sample Size Calculations in Clinical Research. CRC Press.

Ryan, T. (2013). Sample Size Determination and Power. Wiley.

## Comments

Comment by Stephanie Glen on March 22, 2019 at 5:11am

@Frank Deruyck Ah, that's where it gets complicated! You can't solve for n directly (as you noted). If you use the equation above it (for known std dev) to find "n", you'd have an estimate, but it would be an underestimate. So at this point you'd have a couple of choices: use the calculated "Z" sample size equation as a baseline, and guess a bit larger. Or, use software to get "n", then use the equation to check the software's solution. Either way, it's really a guesstimate scenario, because if your population standard deviation is unknown, you're entering the statistical unknown.

Comment by Stephanie Glen on March 22, 2019 at 4:58am

@John Williams

I can see why you might think that! However, my intent was merely to show the process with an example. It can be tweaked slightly for many more situations. For example, the process for comparing variances is very, very similar.

Comment by Frank Deruyck on March 21, 2019 at 10:00pm

Hi Stephanie, in a sample size calculation with unknown standard deviation, a t value needs to be specified that depends on n. However, this n is unknown and needs to be computed, so how do you specify the t(alpha, n-1) value?

Comment by John Williams on March 21, 2019 at 12:10pm

Sigh. A misleading title. Sample size for a difference in two means, not sample size in general. Clickbait at its worst.