# Power calculation for difference in proportions

*M Loecher*

*Friday, September 26, 2014*

I almost posted the following on stackexchange, but as so often before, the process of writing down my question led to its natural resolution!

Dear all, I am trying to understand the origin of the actual formula used to compute the power of the `power.prop.test()` function in R, which is defined in lines 12-14 of the source:

```
tside <- switch(alternative, one.sided = 1, two.sided = 2)
p.body <- quote(pnorm(((sqrt(n) * abs(p1 - p2) - (qnorm(sig.level/tside,
lower.tail = FALSE) * sqrt((p1 + p2) * (1 - (p1 + p2)/2))))/sqrt(p1 *
(1 - p1) + p2 * (1 - p2)))))
```

I took the liberty of rewriting the expression inside the `pnorm()` function in readable notation:

\[
\frac{ \sqrt{n} \cdot |p_1 - p_2| - z_{1-\alpha/k} \cdot \sqrt{(p_1 + p_2) \cdot (1 - (p_1 + p_2)/2)}}{\sqrt{p_1 \cdot (1 - p_1) + p_2 \cdot (1 - p_2)}}
\]

where \(k\) (`tside` in the code) is either 1 or 2 depending on the *alternative* argument. I do not quite understand the second term in the numerator, and there is no reference for it in the man page. In particular, if I interpret it as a reference value, I am surprised to see it depend on \(p_2\)!
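To make the expression concrete, here is a minimal sketch that evaluates it term by term and compares the result with what `power.prop.test()` itself reports. The helper name `power_two_prop` and its argument `k` are my own illustrative choices, not part of R; the comparison assumes the default `strict = FALSE`.

```r
# Hypothetical helper (not part of R): evaluates the expression inside pnorm()
# term by term. k plays the role of tside (1 = one-sided, 2 = two-sided).
power_two_prop <- function(n, p1, p2, sig.level = 0.05, k = 2) {
  z_crit      <- qnorm(sig.level / k, lower.tail = FALSE)   # z_{1 - alpha/k}
  second_term <- sqrt((p1 + p2) * (1 - (p1 + p2) / 2))      # scale in the numerator
  denom       <- sqrt(p1 * (1 - p1) + p2 * (1 - p2))        # scale in the denominator
  pnorm((sqrt(n) * abs(p1 - p2) - z_crit * second_term) / denom)
}

manual   <- power_two_prop(n = 11166, p1 = 0.04, p2 = 0.05)
built_in <- power.prop.test(n = 11166, p1 = 0.04, p2 = 0.05)$power

# Both numbers should coincide if the transcription of the formula is right.
print(c(manual = manual, built_in = built_in))
print(isTRUE(all.equal(manual, built_in)))
```

Splitting the expression into named pieces makes it easier to see which part depends on which proportion.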

In addition, this power function does not agree with my naive version below. Let us fix \(p_1=0.04, p_2=0.05\). We are told (`power.prop.test(p1=0.04, p2=0.05, power=0.95)`) that we need a sample size of *11166* in each group to achieve a power of 0.95. If I were to test the hypothesis \(H_0: p_1=p_2\), my critical value would simply be `tr = qnorm(1-0.05/2, mean=0, sd=sr)`, which does NOT depend on \(p_2\). Here \(sr=\sqrt{p_1 \cdot (1 - p_1) + p_1 \cdot (1 - p_1)}\) is the standard deviation of the difference in sample proportions, up to a factor of \(1/\sqrt{n}\) (under the assumption that \(H_0\) holds).

This is where I realized my fallacy: I was about to substitute \(p_1=0.04\) in the expression for sr. But that would be testing the following null hypothesis: \(H_0: p_1=p_2=0.04\), which is a very different conjecture! So instead, we simply estimate the common \(p\) by the average \(\bar{p} = (p_1 + p_2)/2\). The null standard deviation then becomes \(\sqrt{2 \bar{p} (1 - \bar{p})} = \sqrt{(p_1 + p_2) \cdot (1 - (p_1 + p_2)/2)}\), and that is exactly how the second term in the numerator of the formula above is formed!
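A quick numerical check of this resolution, as a sketch: the variable names and the closed-form helper `n_for` are my own, obtained by solving the power expression for \(n\), and are not taken from the R source. It confirms that the pooled standard deviation matches the term in the formula and shows how the required sample size shifts if the naive \(p_1\)-only version is used instead.

```r
p1 <- 0.04; p2 <- 0.05
alpha <- 0.05; target_power <- 0.95

# Pooled estimate of the common proportion under H0: p1 = p2
p_bar      <- (p1 + p2) / 2
sd_pooled  <- sqrt(2 * p_bar * (1 - p_bar))
sd_formula <- sqrt((p1 + p2) * (1 - (p1 + p2) / 2))  # term from the power formula
print(isTRUE(all.equal(sd_pooled, sd_formula)))

# Naive version: plugs p1 into both groups, i.e. tests H0: p1 = p2 = 0.04
sd_naive <- sqrt(2 * p1 * (1 - p1))

# Standard deviation under the alternative (denominator of the formula)
sd_alt <- sqrt(p1 * (1 - p1) + p2 * (1 - p2))

# Assumed closed form: set the pnorm() argument equal to qnorm(target_power)
# and solve for n (two-sided test, non-strict).
n_for <- function(sd_null) {
  ((qnorm(1 - alpha / 2) * sd_null + qnorm(target_power) * sd_alt) / abs(p1 - p2))^2
}

n_pooled  <- n_for(sd_pooled)
n_naive   <- n_for(sd_naive)
n_builtin <- power.prop.test(p1 = p1, p2 = p2, power = target_power)$n

print(c(pooled = n_pooled, naive = n_naive, built_in = n_builtin))
```

The pooled version should land on the sample size reported by `power.prop.test()`, while the naive version yields a smaller one because \(p_1 (1 - p_1) < \bar{p} (1 - \bar{p})\) for these values.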
