A One-Page Primer on: Statistical Power

Categories: statistical power, power analysis, minimum detectable effect, sample size, research design
Statistical power is the chance of rejecting the null hypothesis when it is false. This one-page primer explains why power matters to both researchers and readers, how to compute it, and offers rules of thumb and key readings.
Author

Carlisle Rainey

Published

August 30, 2025

“Of course, investigators could not have known how underpowered their research was, as their training had not prepared them to know anything about power, let alone how to use it in research planning. One might think that after 1969, when I published my power handbook that made power analysis as easy as falling off a log, the concepts and methods of power analysis would be taken to the hearts of null hypothesis testers. So one might think. (Stay tuned.)” — Cohen (1990)


What is statistical power?

Statistical power is the chance that your data lead you to reject the null hypothesis if that null hypothesis is incorrect. Power depends on the magnitude of the effect, so you must assume an effect for a power calculation. Claims about power have the form:

“the experiment has \(X\)% power to detect an effect of size \(\delta\).”
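To see how such a claim behaves, a normal approximation is handy (an illustrative formula, not from the original post, assuming a one-sided test at level \(\alpha\) to match the directional hypotheses below): power \(\approx \Phi\!\left(\delta/\text{SE} - z_{1-\alpha}\right)\), where \(\delta\) is the assumed effect, \(\text{SE}\) is the standard error of the estimate, \(\Phi\) is the standard-normal CDF, and \(z_{1-\alpha}\) is the critical value. The larger the assumed effect relative to the standard error, the higher the power.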

Why does power matter to the researcher?

You are a clever theorist and hypothesize that X increases Y; then the null hypothesis is that X does not increase Y. You design an experiment and statistical test. You plan to reject the null hypothesis if the p-value is less than 0.05. Two types of errors can occur.

Type I: First, you might incorrectly reject the null hypothesis. That is, the null is true, but your p-value is less than 0.05 and you reject it.

Type II: Second, you might incorrectly fail to reject the null hypothesis. That is, the null is false (i.e., the research hypothesis is correct), but your p-value is greater than 0.05 and you do not reject the null.
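To make the two error rates concrete, here is a minimal simulation sketch (not from the original primer; the group size, standard deviation, and true effect are arbitrary illustrative values) that estimates both for a two-group experiment analyzed with a one-sided t-test:

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)
n, sd, effect, alpha, reps = 50, 1.0, 0.3, 0.05, 5_000

def share_rejected(true_effect):
    """Fraction of simulated experiments with a one-sided p-value below alpha."""
    rejections = 0
    for _ in range(reps):
        control = rng.normal(0.0, sd, n)
        treated = rng.normal(true_effect, sd, n)
        p = ttest_ind(treated, control, alternative="greater").pvalue
        rejections += p < alpha
    return rejections / reps

# With the null true, rejections are Type I errors (about 5% by construction).
print("Type I error rate:", share_rejected(0.0))
# With the null false, failures to reject are Type II errors; power = 1 - Type II rate.
print("Power:            ", share_rejected(effect))
```

With these illustrative numbers the design is underpowered: the null is false, yet the test rejects it less than half the time.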

Let’s think a bit more about this second “error.” You’ve paid all the costs of designing and fielding an experiment, but in the end you cannot distinguish between your research hypothesis and the null hypothesis. This is a wasted opportunity. Researchers care about statistical power because they do not want to waste their time, effort, and money.

Why does power matter to the reader?

When reading a paper, why should you care about power?

My perspective: Power doesn’t matter after running the experiment; all of the relevant information is in the confidence interval. Caveat: power does change the meaning of “not significant,” but that information, too, is contained in the confidence interval.

Others’ perspective: Power matters greatly after running the experiment. Authors and reviewers tend to dismiss papers without statistically significant findings, and this significance filter, combined with low power, causes published estimates to diverge from the truth, typically by exaggerating it.
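A short simulation illustrates the concern (a sketch with assumed numbers, not the author’s): when power is low, the estimates that pass a significance filter systematically overstate the true effect.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
true_effect, se, alpha, reps = 0.1, 0.1, 0.05, 100_000

# Each simulated study reports an estimate drawn around the (small) true effect.
estimates = rng.normal(true_effect, se, reps)
# A one-sided significance filter: only "significant" estimates get published.
significant = estimates / se > norm.ppf(1 - alpha)

print("power (share significant):     ", round(significant.mean(), 2))
print("average estimate, all studies: ", round(estimates.mean(), 2))
print("average published estimate:    ", round(estimates[significant].mean(), 2))
```

With these assumed numbers, roughly a quarter of studies reach significance, and the published estimates overstate the true effect of 0.1 by about a factor of two.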

How can I compute statistical power?

To compute statistical power, first ask: “for what effect?” Identify the range of substantively interesting effects and choose the smallest effect in this range. This is the smallest effect of substantive interest (SESOI). The experiment should be well-powered for the SESOI.

Fancy software can compute power for you, but I like simple rules of thumb. Experiments have:

  • 80% power to detect effects that are 2.5 times the standard error, and
  • 95% power to detect effects that are 3.3 times the standard error.
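These multipliers can be checked against the normal approximation given earlier, assuming a one-sided test at the 5% level (a quick sketch, not from the original):

```python
from scipy.stats import norm

def power(multiple, alpha=0.05):
    """Power of a one-sided z-test when the true effect is `multiple` standard errors."""
    return 1 - norm.cdf(norm.ppf(1 - alpha) - multiple)

print(round(power(2.5), 2))  # about 0.80
print(round(power(3.3), 2))  # about 0.95
```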

As a reader, you can simply multiply the reported standard error by 2.5 and ask yourself: “is an effect of this size substantively small?” If so, the experiment can detect even small effects and is well powered. If 2.5 times the reported standard error is a substantively large effect, then the experiment is underpowered: it can reliably detect only large effects and will likely miss smaller, still-interesting ones.

As a researcher, you can anticipate the standard error of a simple two-group, difference-in-means comparison using \(\text{SE} = \sigma\sqrt{2/n}\), where \(\sigma\) is the standard deviation of the outcome variable and \(n\) is the sample size per condition. You can obtain a suitable value of \(\sigma\) from an existing study of the same outcome and population, or from a pilot study if you plan to run one. As a too-crude shortcut in a pinch, you might use the range of the outcome divided by 4.
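As a planning sketch (with made-up values for \(\sigma\), the SESOI, and the group size, not figures from the post), you can anticipate the standard error and invert the 2.5-standard-error rule of thumb to find the required sample size:

```python
import math

sigma = 10.0        # assumed SD of the outcome (e.g., from a prior study or pilot)
sesoi = 5.0         # smallest effect of substantive interest, in outcome units
n_per_group = 50

# Anticipated SE of a two-group difference in means with n per condition.
se = sigma * math.sqrt(2 / n_per_group)
print("anticipated SE:", se, "| detectable with 80% power:", 2.5 * se)

# Or invert the 2.5-standard-error rule: choose n so that SESOI = 2.5 * SE.
n_needed = math.ceil(2 * (2.5 * sigma / sesoi) ** 2)
print("n per group for 80% power at the SESOI:", n_needed)
```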

Regression adjustment can increase power: it shrinks the standard error by a factor of about \(\sqrt{1 - R^2}\), where \(R^2\) comes from regressing the outcome variable on the predictors. Unless the predictors are highly predictive of the outcome, adjustment doesn’t matter much. The important exception is the pre–post design, in which the researcher measures the outcome before and after the treatment and adjusts for the pre-treatment measure. This shrinks the standard error by at least 30%, typically by about 50%, and perhaps by as much as 70%. You can obtain a suitable value of \(R^2\) from a pilot study, if you have one.
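To fold adjustment into the same planning exercise (a sketch with assumed \(R^2\) values, not figures from the post), shrink the anticipated standard error by \(\sqrt{1 - R^2}\):

```python
import math

sigma, n_per_group = 10.0, 50
se_unadjusted = sigma * math.sqrt(2 / n_per_group)

# Adjustment shrinks the SE by sqrt(1 - R^2); try a few assumed R^2 values,
# from weak covariates up to a strong pre-treatment measure of the outcome.
for r2 in [0.1, 0.5, 0.75, 0.9]:
    se_adjusted = se_unadjusted * math.sqrt(1 - r2)
    shrinkage = 1 - math.sqrt(1 - r2)
    print(f"R^2 = {r2:.2f}: SE {se_unadjusted:.2f} -> {se_adjusted:.2f} "
          f"({shrinkage:.0%} smaller)")
```

An \(R^2\) around 0.5, 0.75, or 0.9 corresponds roughly to the 30%, 50%, and 70% reductions mentioned above.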