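If you have "continuous" values between 0 and 1 (seemingly continuous, since they may still be discrete), there are at least two cases:
- They come from a number of independent binary trials, and the "continuous" value is the number of successes divided by the number of trials. Then a binomial GLM may be appropriate. In that case, you would fit it in R with
glm(cbind(numberSuccesses,numberFailures)~x,family=binomial)
- If that is not the case, then you probably have something better suited to a Beta model. The link I provided shows how to do this in R.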
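For the second case, here is a minimal sketch of a Beta regression fit. It assumes the `betareg` package from CRAN is installed and is not necessarily the approach used in the linked material. The simulated data, the precision value `phi`, and the names `dat` and `fit_beta` are all hypothetical and only serve the illustration. Note that the response must lie strictly between 0 and 1 for this model.

```r
# Assumption: the betareg package is available (install.packages("betareg")).
set.seed(42)
x   <- runif(200)
mu  <- plogis(-1 + 2 * x)                   # simulated mean, strictly inside (0, 1)
phi <- 10                                   # hypothetical precision parameter
y   <- rbeta(200, mu * phi, (1 - mu) * phi) # Beta-distributed response
dat <- data.frame(x = x, y = y)

fit_beta <- NULL
if (requireNamespace("betareg", quietly = TRUE)) {
  # Beta regression with the default logit link for the mean
  fit_beta <- betareg::betareg(y ~ x, data = dat)
  print(summary(fit_beta))
}
```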
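Note that in R, glm(y~x,family=binomial) with a "continuous" y will throw a warning, and the results will generally differ from the fit that uses the numbers of successes and failures: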
set.seed(1)
successes<-sample(1:10,100,replace=TRUE)
x<-1:100
n<-12
failures<-n-successes
summary(glm(cbind(successes,failures)~x,family=binomial))
Call:
glm(formula = cbind(successes, failures) ~ x, family = binomial)
Deviance Residuals:
Min 1Q Median 3Q Max
-2.8197 -0.9434 0.0454 0.9358 2.4921
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.24622 0.11349 -2.17 0.03 *
x 0.00080 0.00195 0.41 0.68
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 134.99 on 99 degrees of freedom
Residual deviance: 134.82 on 98 degrees of freedom
AIC: 422.2
Number of Fisher Scoring iterations: 3
But
props<-successes/n
summary(glm(props~x,family=binomial))
Call:
glm(formula = props ~ x, family = binomial)
Deviance Residuals:
Min 1Q Median 3Q Max
-0.852 -0.282 -0.105 0.394 0.760
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.134339 0.403836 -0.33 0.74
x 0.000281 0.006941 0.04 0.97
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 20.888 on 99 degrees of freedom
Residual deviance: 20.887 on 98 degrees of freedom
AIC: 141.3
Number of Fisher Scoring iterations: 3
Warning message:
In eval(expr, envir, enclos) : non-integer #successes in a binomial glm!
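If only the proportions and the number of trials are at hand, one option is to pass the trial counts through the `weights` argument. R's binomial family treats a response between 0 and 1 as a proportion of successes, with the number of cases taken from the weights. The sketch below reuses the simulated data from above. The names `fit_counts` and `fit_weighted` are introduced here for illustration only. Under these assumptions the two fits should agree, and the non-integer warning should not appear because `weights * props` recovers whole numbers.

```r
set.seed(1)
successes <- sample(1:10, 100, replace = TRUE)
x <- 1:100
n <- 12
props <- successes / n

# Reference fit on success/failure counts
fit_counts <- glm(cbind(successes, n - successes) ~ x, family = binomial)

# Same model on proportions, with the number of trials supplied as prior weights
fit_weighted <- glm(props ~ x, family = binomial,
                    weights = rep(n, length(props)))

# Compare coefficients and standard errors of the two fits
all.equal(coef(fit_counts), coef(fit_weighted))
all.equal(sqrt(diag(vcov(fit_counts))), sqrt(diag(vcov(fit_weighted))))
```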