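$\alpha$ and $\beta$ are related. I'll try to illustrate the point with a diagnostic test. Say you have a diagnostic test that measures the level of a blood marker. It is known that people with a certain disease have lower levels of this marker than healthy people. You therefore have to decide on a cutoff value: people below it are classified as "sick", and people above it are considered healthy. However, the distribution of the marker is likely to vary considerably even within the sick and within the healthy group. Some healthy people may have very low marker levels even though they are perfectly healthy, and some sick people may have high levels even though they have the disease.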
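Four outcomes are possible:
- a sick person is correctly identified as sick (true positive = TP)
- a sick person is falsely classified as healthy (false negative = FN)
- a healthy person is correctly identified as healthy (true negative = TN)
- a healthy person is falsely classified as sick (false positive = FP)
These possibilities can be illustrated with a 2x2 table: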
               Sick  Healthy
Test positive  TP    FP
Test negative  FN    TN
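$\alpha$ denotes the false positive rate, i.e. $\alpha = FP/(FP+TN)$. $\beta$ is the false negative rate, i.e. $\beta = FN/(TP+FN)$. I wrote a simple R script to illustrate the situation graphically.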
alphabeta <- function(mean.sick=100, sd.sick=10,
                      mean.healthy=130, sd.healthy=10, cutoff=120, n=10000,
                      side="below", do.plot=TRUE) {

  # Simulate marker levels for n sick and n healthy people
  popsick <- rnorm(n, mean=mean.sick, sd=sd.sick)
  pophealthy <- rnorm(n, mean=mean.healthy, sd=sd.healthy)

  # Classify everyone; 'side' says on which side of the cutoff a person counts as sick
  if ( side == "below" ) {
    truepos <- length(popsick[popsick <= cutoff])
    falsepos <- length(pophealthy[pophealthy <= cutoff])
    trueneg <- length(pophealthy[pophealthy > cutoff])
    falseneg <- length(popsick[popsick > cutoff])
  } else if ( side == "above" ) {
    truepos <- length(popsick[popsick >= cutoff])
    falsepos <- length(pophealthy[pophealthy >= cutoff])
    trueneg <- length(pophealthy[pophealthy < cutoff])
    falseneg <- length(popsick[popsick < cutoff])
  }

  # 2x2 table: rows = test result, columns = true status
  twotable <- matrix(c(truepos, falsepos, falseneg, trueneg),
                     2, 2, byrow=TRUE)
  rownames(twotable) <- c("Test positive", "Test negative")
  colnames(twotable) <- c("Sick", "Healthy")

  # alpha = 1 - specificity, beta = 1 - sensitivity (sensitivity = power)
  spec <- twotable[2, 2]/(twotable[2, 2] + twotable[1, 2])
  alpha <- 1 - spec
  sens <- pow <- twotable[1, 1]/(twotable[1, 1] +
                                   twotable[2, 1])
  beta <- 1 - sens

  pos.pred <- twotable[1,1]/(twotable[1,1] + twotable[1,2])
  neg.pred <- twotable[2,2]/(twotable[2,2] + twotable[2,1])

  if ( do.plot == TRUE ) {

    dsick <- density(popsick)
    dhealthy <- density(pophealthy)

    par(mar=c(5.5, 4, 0.5, 0.5))
    plot(range(c(dsick$x, dhealthy$x)), range(c(c(dsick$y,
         dhealthy$y))), type = "n", xlab="", ylab="",
         axes=FALSE)
    box()
    axis(1, at=mean(pophealthy),
         lab=substitute(mu[H[0]]~paste("=", m, sep=""),
         list(m=mean.healthy)), cex.axis=1.5, tck=0.02)
    axis(1, at=mean(popsick),
         lab=substitute(mu[H[1]] ~ paste("=", m, sep=""),
         list(m=mean.sick)), cex.axis=1.5, tck=0.02)
    axis(1, at=cutoff, lab=substitute(italic(paste("Cutoff=",
         coff, sep="")), list(coff=cutoff)), pos=-0.004,
         tick=FALSE, cex.axis=1.25)

    # Healthy density; the shaded tail on the "sick" side of the cutoff is alpha
    lines(dhealthy, col = "steelblue", lwd=2)
    if ( side == "below" ) {
      polygon(c(cutoff, dhealthy$x[dhealthy$x<=cutoff],
                cutoff), c(0, dhealthy$y[dhealthy$x<=cutoff],0),
              col = "grey65")
    } else if ( side == "above" ) {
      polygon(c(cutoff, dhealthy$x[dhealthy$x>=cutoff],
                cutoff), c(0, dhealthy$y[dhealthy$x>=cutoff],0),
              col = "grey65")
    }

    # Sick density; the shaded tail on the "healthy" side of the cutoff is beta
    lines(dsick, col = "red", lwd=2)
    if ( side == "below" ) {
      polygon(c(cutoff, dsick$x[dsick$x>cutoff], cutoff),
              c(0, dsick$y[dsick$x>cutoff], 0) , col="grey90")
    } else if ( side == "above" ) {
      polygon(c(cutoff, dsick$x[dsick$x<=cutoff], cutoff),
              c(0, dsick$y[dsick$x<=cutoff],0) , col="grey90")
    }

    legend("topleft",
           legend=
             (c(as.expression(substitute(alpha~paste("=", a),
                list(a=round(alpha,3)))),
                as.expression(substitute(beta~paste("=", b),
                list(b=round(beta,3)))))), fill=c("grey65",
                "grey90"), cex=1.2, bty="n")
    abline(v=mean(popsick), lty=3)
    abline(v=mean(pophealthy), lty=3)
    abline(v=cutoff, lty=1, lwd=1.5)
    abline(h=0)
  }

  #list(specificity=spec, sensitivity=sens, alpha=alpha, beta=beta, power=pow, positiv.predictive=pos.pred, negative.predictive=neg.pred)

  # Return the two error rates
  c(alpha, beta)
}
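Let's look at an example. We assume that the mean level of the blood marker among sick people is 100 with a standard deviation of 10. Among healthy people, the mean level is 140 with a standard deviation of 15. The clinician sets the cutoff at 120.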
alphabeta(mean.sick=100, sd.sick=10, mean.healthy=140,
          sd.healthy=15, cutoff=120, n=100000, do.plot=TRUE,
          side="below")
              Sick Healthy
Test positive 9764     901
Test negative  236    9099
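You can see that the shaded areas are related to each other. In this case, $\alpha = 901/(901+9099) \approx 0.09$ and $\beta = 236/(236+9764) \approx 0.024$. But what happens if the clinician sets the cutoff differently? Let's set it a bit lower, to 105, and see what happens.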
              Sick Healthy
Test positive 6909      90
Test negative 3091    9910
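Our $\alpha$ is now very low because almost no healthy people are diagnosed as sick. But our $\beta$ has increased, because sick people with a high blood marker level are now falsely classified as healthy.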
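As a cross-check on the two tables, the simulated rates can be compared with the exact tail probabilities of the normal distributions assumed in the example. This is only a sketch: the helper names `rates_from_counts` and `exact_rates` are made up for illustration and are not part of the script above.

```r
# Hypothetical helper: error rates from the four cells of a 2x2 table
rates_from_counts <- function(tp, fp, fn, tn) {
  c(alpha = fp / (fp + tn),   # false positive rate
    beta  = fn / (tp + fn))   # false negative rate
}

# Hypothetical helper: exact rates under the normal model of the example,
# for side = "below" (marker at or below the cutoff means "test positive")
exact_rates <- function(cutoff, mean.sick = 100, sd.sick = 10,
                        mean.healthy = 140, sd.healthy = 15) {
  c(alpha = pnorm(cutoff, mean = mean.healthy, sd = sd.healthy),
    beta  = pnorm(cutoff, mean = mean.sick, sd = sd.sick, lower.tail = FALSE))
}

# Counts copied from the two tables above
comparison <- rbind(
  simulated_120 = rates_from_counts(tp = 9764, fp = 901, fn = 236, tn = 9099),
  exact_120     = exact_rates(120),
  simulated_105 = rates_from_counts(tp = 6909, fp = 90, fn = 3091, tn = 9910),
  exact_105     = exact_rates(105)
)
print(round(comparison, 3))
```

The simulated and exact rows should agree up to sampling noise, which suggests the trade-off is a property of the two distributions and the cutoff, not of the particular random draw.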
Finally, let us look at how $\alpha$ and $\beta$ change for different cutoffs:
cutoffs <- seq(0, 200, by=0.1)
cutoff.grid <- expand.grid(cutoffs)

plot.frame <- apply(cutoff.grid, MARGIN=1, FUN=alphabeta,
                    mean.sick=100, sd.sick=10, mean.healthy=140,
                    sd.healthy=15, n=100000, do.plot=FALSE, side="below")

plot(plot.frame[1,] ~ cutoffs, type="l", las=1,
     xlab="Cutoff value", ylab="Alpha/Beta", lwd=2,
     cex.axis=1.5, cex.lab=1.2)
lines(plot.frame[2,]~cutoffs, col="steelblue", lty=2, lwd=2)
legend("topleft", legend=c(expression(alpha),
       expression(beta)), lwd=c(2,2),lty=c(1,2), col=c("black",
       "steelblue"), bty="n", cex=1.2)
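You can immediately see that the ratio of $\alpha$ and $\beta$ is not constant. What is also very important is the effect size. In this case, that is the difference between the mean blood marker levels of sick and healthy people. The greater the difference, the more easily the two groups can be separated by a cutoff.
Here we have a "perfect" test, in the sense that a cutoff of 150 discriminates the sick from the healthy.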
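The effect-size point can also be checked numerically: hold $\alpha$ fixed and watch $\beta$ shrink as the group means move apart. The following is a rough sketch assuming normal marker distributions; the fixed $\alpha$ of 0.05, the common standard deviation of 10 and the grid of healthy means are arbitrary illustration values, not taken from the example above.

```r
# Assumed illustration values (not from the text)
alpha.fixed  <- 0.05
mean.sick    <- 100
sd.common    <- 10
mean.healthy <- seq(110, 160, by = 10)

# Cutoff that lets exactly alpha.fixed of the healthy fall at or below it
cutoff <- qnorm(alpha.fixed, mean = mean.healthy, sd = sd.common)

# Share of sick people above that cutoff, i.e. false negatives
beta <- pnorm(cutoff, mean = mean.sick, sd = sd.common, lower.tail = FALSE)

effect <- data.frame(mean.diff = mean.healthy - mean.sick,
                     cutoff    = round(cutoff, 1),
                     beta      = round(beta, 4))
print(effect)
```

With $\alpha$ pinned, every increase in the mean difference lowers $\beta$, so a large enough effect size yields a test that is close to error-free on both sides.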
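Bonferroni adjustments
Bonferroni adjustments reduce the $\alpha$ error but inflate the type II error ($\beta$). This means that the risk of a false negative decision increases while false positives are minimized. That is why the Bonferroni adjustment is often called conservative. In the graphs above, note how $\beta$ increased when the cutoff was lowered from 120 to 105: it went from $0.02$ to $0.31$. At the same time, $\alpha$ decreased from $0.09$ to $0.01$.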
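To sketch the same effect for an actual Bonferroni correction, the diagnostic-test setup can be reused as a stand-in for a one-sided test: the healthy distribution plays the role of $H_0$ and the sick distribution that of $H_1$. The family-wise level of 0.05 and the numbers of comparisons `m` are assumptions chosen purely for illustration.

```r
# Assumed illustration values (not from the text)
alpha.family <- 0.05
m <- c(1, 2, 5, 10, 20)          # number of comparisons

# Bonferroni: each single test is run at alpha.family / m
alpha.adj <- alpha.family / m

# Cutoff at or below which H0 ("healthy") is rejected, using the
# example's healthy distribution (mean 140, sd 15)
cutoff <- qnorm(alpha.adj, mean = 140, sd = 15)

# Type II error: probability that a sick person (mean 100, sd 10)
# lies above the cutoff and is therefore missed
beta <- pnorm(cutoff, mean = 100, sd = 10, lower.tail = FALSE)

bonf <- data.frame(m = m, alpha.adj = alpha.adj,
                   cutoff = round(cutoff, 1), beta = round(beta, 3))
print(bonf)
```

As `m` grows, the per-test $\alpha$ shrinks, the cutoff moves further into the sick distribution, and $\beta$ rises, which is the conservative behaviour described above.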