Power analysis for binomial data when the null hypothesis is that $p=0$

Tags: hypothesis-testing, sample-size, statistical-power

2022-03-10 17:16:38

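I'd like to do a power analysis for a single sample from binomial data, with $H_0: p = 0$ vs. $H_1: p = 0.001$, where $p$ is the proportion of successes in the population. If $0 < p < 1$, I could use either the normal approximation to the binomial or a $\chi^2$-test, but with $p = 0$ these both fail.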

3 answers

You have a one-sided, exact alternative hypothesis $p_{1} > p_{0}$, where $p_{1} = 0.001$ and $p_{0} = 0$.

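The first step is to identify a threshold $c$ for the number of successes such that the probability of getting at least $c$ successes in a sample of size $n$ is very low under the null hypothesis (conventionally $\alpha = 0.05$). In your case $c=1$, regardless of your particular choice of $n \geqslant 1$ and $\alpha > 0$.
The second step is to find the probability of getting at least $c$ successes in a sample of size $n$ under the alternative hypothesis: this is your power. Here you need a fixed $n$, so that the binomial distribution $\mathcal{B}(n, p_{1})$ is fully specified.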
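Because $c = 1$ here, the power has a simple closed form. As a quick sketch using only the symbols already defined ($n$, $p_{1}$, and $X$ for the number of successes in the sample):

$$\text{power}(n) = P(X \geqslant 1 \mid p_{1}) = 1 - P(X = 0 \mid p_{1}) = 1 - (1 - p_{1})^{n}$$

For $n = 500$ and $p_{1} = 0.001$ this gives $1 - 0.999^{500} \approx 0.394$.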

The second step in R with $n = 500$:

> n  <- 500                 # sample size
> p1 <- 0.001               # success probability under alternative hypothesis
> cc <- 1                   # threshold
> sum(dbinom(cc:n, n, p1))  # power: probability for cc or more successes given p1
[1] 0.3936211
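As a cross-check, the same power can be computed in more than one way. The sketch below is illustrative only: the variable names `pow_sum`, `pow_tail` and `pow_closed` are not part of the original answer. It compares the summed point probabilities with the upper-tail probability from `pbinom()` and with the closed form $1 - (1 - p_{1})^{n}$.

```r
n  <- 500     # sample size, as in the example above
p1 <- 0.001   # success probability under the alternative hypothesis

pow_sum    <- sum(dbinom(1:n, n, p1))               # sum of point probabilities P(X = 1), ..., P(X = n)
pow_tail   <- pbinom(0, n, p1, lower.tail = FALSE)  # upper tail P(X > 0)
pow_closed <- 1 - (1 - p1)^n                        # closed form 1 - P(X = 0)

print(c(sum = pow_sum, tail = pow_tail, closed = pow_closed))
```

All three values should agree up to floating-point error. The `pbinom()` form is usually the most convenient because it is vectorized over the sample size.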

To get an idea of how the power changes with sample size, you can draw a power function:

nn   <- 10:2000                 # sample sizes
pow  <- 1-pbinom(cc-1, nn, p1)  # corresponding power
tStr <- expression(paste("Power for ", X>0, " given ", p[1]==0.001))
plot(nn, pow, type="l", xaxs="i", xlab="sample size", ylab="power",
     lwd=2, col="blue", main=tStr, cex.lab=1.4, cex.main=1.4)

If you want to know what sample size you need to achieve at least a pre-specified power, you can use the power values calculated above. Say you want a power of at least $0.5$.

> powMin <- 0.5
> idx    <- which(pow >= powMin)[1]     # first index with a power of at least 0.5
> nn[idx]     # sample size for that index
[1] 693

> pow[idx]    # power for that sample size
[1] 0.5000998

So you need a sample size of at least $693$ to achieve a power of $0.5$.
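Since the threshold is one success, the required sample size can also be obtained without a grid search. The following is a minimal sketch, assuming the power is $1 - (1 - p_{1})^{n}$ (the probability of at least one success); solving for $n$ gives $n \geqslant \log(1 - \text{power}) / \log(1 - p_{1})$. The name `nMin` is introduced here for illustration.

```r
p1     <- 0.001   # success probability under the alternative hypothesis
powMin <- 0.5     # desired minimum power

# Smallest n with 1 - (1 - p1)^n >= powMin
nMin <- ceiling(log(1 - powMin) / log(1 - p1))

print(nMin)                    # required sample size
print(1 - (1 - p1)^nMin)       # power at nMin
print(1 - (1 - p1)^(nMin - 1)) # power one observation earlier, for comparison
```

This should reproduce the value found from the grid of sample sizes, and it also works for targets that fall outside the range of `nn`.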

You can answer this question easily with the pwr package in R.

You will need to define a significance level, power, and effect size. Typically, significance level is set to 0.05 and power is set to 0.8. Higher power will require more observations. Lower significance level will decrease power.

The effect size for proportions used in this package is Cohen's h. The cutoff for a small h is often taken to be 0.20. The actual cutoff varies by application, and might be smaller in your case. A smaller h means more observations will be required. You said your alternative is $p = 0.001$. That is very small:

> library(pwr)
> ES.h(.001, 0)
[1] 0.0632561
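For reference, Cohen's h is the difference between the arcsine-transformed proportions, $h = 2\arcsin\sqrt{p_{1}} - 2\arcsin\sqrt{p_{2}}$. The sketch below recomputes the value in base R; the helper name `cohen_h` is hypothetical and is not a function of the pwr package.

```r
# Cohen's h: difference of arcsine-square-root transformed proportions
cohen_h <- function(p1, p2) {
  2 * asin(sqrt(p1)) - 2 * asin(sqrt(p2))
}

h <- cohen_h(0.001, 0)
print(h)
```

The result should match the `ES.h()` output shown above.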

But we can still proceed.

> pwr.p.test(sig.level=0.05, power=.8, h = ES.h(.001, 0), alternative="greater", n = NULL)

 proportion power calculation for binomial distribution (arcsine transformation) 

          h = 0.0632561
          n = 1545.124
  sig.level = 0.05
      power = 0.8
alternative = greater

Using these values, you need at least 1546 observations.
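To see where this number comes from, here is a rough sketch in base R. It assumes the one-sided normal approximation $n = \left((z_{1-\alpha} + z_{\text{power}})/h\right)^{2}$, which reproduces the value printed above; the variable names are illustrative. It also evaluates the exact probability of at least one success at the rounded-up sample size, for comparison.

```r
p1    <- 0.001   # success probability under the alternative hypothesis
alpha <- 0.05    # significance level
power <- 0.8     # target power

h        <- 2 * asin(sqrt(p1)) - 2 * asin(sqrt(0))        # Cohen's h for p1 vs. 0
n_approx <- ((qnorm(1 - alpha) + qnorm(power)) / h)^2     # normal-approximation sample size
n_up     <- ceiling(n_approx)                             # round up to whole observations

exact_power <- 1 - (1 - p1)^n_up   # exact P(at least one success) under p1

print(c(n_approx = n_approx, n = n_up, exact_power = exact_power))
```

Under these assumptions the exact power of a "reject on the first success" rule at this sample size is about 0.787, slightly below the 0.8 target, so the approximation is best read as a rough guide when $p_{0} = 0$.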

In your specific case there is a simple exact solution:

Under the particular null hypothesis $H_0: p=0$ you should never observe a success. So as soon as you observe one success you can be sure that $p\neq0$ .
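This can be checked numerically. The sketch below (variable names are illustrative) computes the probability of one or more successes under $p = 0$ for a few sample sizes, which is the type I error rate of the rule "reject $H_0$ on the first success".

```r
n <- c(1, 10, 500, 2000)   # a few example sample sizes

# P(X >= 1) when the success probability is exactly 0
alpha_exact <- pbinom(0, size = n, prob = 0, lower.tail = FALSE)

print(alpha_exact)
```

The probabilities are zero for every sample size, so this test never rejects a true null hypothesis and no significance level needs to be chosen.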

Under the alternative $H_1: p=0.001$, the number of trials required to observe the first success follows a geometric distribution, so the probability of seeing at least one success within $k$ trials is $1-(1-p)^{k}$. To obtain the minimum sample size that achieves a power of $1-\beta$, you need to find the smallest $k$ such that

$$1-\beta \leq 1-(1-p)^{k}$$

that is, $k \geq \log(\beta)/\log(1-p)$. So with $p=0.001$, to get $80\%$ power you would need at least 1609 samples.
