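I have three predictors and one response. What should I do if my response variable is binary?
I need the linear and quadratic coefficients from a GLM with a binary response. What is the best option?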
Cross Validated
r
logistic
generalized-linear-model
binary-data
2022-03-17 11:00:54
1 Answer
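You can add a quadratic term to a logistic regression exactly as you would in plain old linear regression. It is a simple way to let the model include curvature, with the curve living on the log-odds scale rather than the probability scale. Make sure you understand what that implies. I suspect you want an R tutorial, which is off topic on CV. The basic way to add a quadratic term in R is to include I(x^2) in the formula; the I() wrapper is needed because ^ has a special meaning inside a formula. Here is a simple example: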
lo.to.p = function(lo){ # we need this function to generate the data
odds = exp(lo)
prob = odds/(1+odds)
return(prob)
}
set.seed(4649) # this makes the example exactly reproducible
x1 = runif(100, min=0, max=10) # you have 3, largely uncorrelated predictors
x2 = runif(100, min=0, max=10)
x3 = runif(100, min=0, max=10)
lo = -78 + 35*x1 - 3.5*(x1^2) + .1*x2   # there is a quadratic relationship w/ x1, a
p = lo.to.p(lo)                         #  linear relationship w/ x2 & no relationship
y = rbinom(100, size=1, prob=p)         #  w/ x3

model = glm(y~x1+I(x1^2)+x2+x3, family=binomial)
summary(model)
# Call:
# glm(formula = y ~ x1 + I(x1^2) + x2 + x3, family = binomial)
#
# Deviance Residuals:
#      Min        1Q    Median        3Q       Max
# -1.74280  -0.00387   0.00000   0.04145   1.74573
#
# Coefficients:
#              Estimate Std. Error z value Pr(>|z|)
# (Intercept) -53.65462   19.65288  -2.730  0.00633 **
# x1           24.78164    8.92910   2.775  0.00551 **
# I(x1^2)      -2.49888    0.89344  -2.797  0.00516 **
# x2            0.03318    0.20198   0.164  0.86952
# x3           -0.09277    0.18650  -0.497  0.61890
# ---
# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
# (Dispersion parameter for binomial family taken to be 1)
#
# Null deviance: 128.207 on 99 degrees of freedom
# Residual deviance: 18.647 on 95 degrees of freedom
# AIC: 28.647
#
# Number of Fisher Scoring iterations: 10
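
To see what the quadratic term means in practice, here is a rough sketch that refits the same simulated model and reads the curve off the coefficients. On the log-odds scale the x1 part of the fit is a parabola, so it turns at -b1/(2*b2). The names dat, turning_point, grid, and model_poly are my own, introduced only for illustration, and holding x2 and x3 at their means is an arbitrary choice. With data this steep, R may warn that some fitted probabilities are numerically 0 or 1.

```r
# Regenerate the simulated data from the example (same seed, same formulas)
set.seed(4649)
x1 <- runif(100, min = 0, max = 10)
x2 <- runif(100, min = 0, max = 10)
x3 <- runif(100, min = 0, max = 10)
lo <- -78 + 35*x1 - 3.5*(x1^2) + 0.1*x2
p  <- exp(lo) / (1 + exp(lo))            # log odds -> probability
y  <- rbinom(100, size = 1, prob = p)

dat   <- data.frame(y, x1, x2, x3)       # 'dat' is an illustrative name
model <- glm(y ~ x1 + I(x1^2) + x2 + x3, family = binomial, data = dat)

# The log odds are b1*x1 + b2*x1^2 (+ other terms), a parabola in x1.
# Its turning point is at -b1 / (2*b2); the data were simulated with
# a true turning point of 35 / (2 * 3.5) = 5.
b <- coef(model)
turning_point <- unname(-b["x1"] / (2 * b["I(x1^2)"]))
turning_point

# Predicted probabilities along x1, holding x2 and x3 at their sample means
# (an assumption made here just to get a single curve to look at)
grid <- data.frame(x1 = seq(0, 10, by = 0.5), x2 = mean(x2), x3 = mean(x3))
grid$p_hat <- predict(model, newdata = grid, type = "response")
head(grid)

# poly(x1, 2, raw = TRUE) builds the same two columns (x1 and x1^2),
# so this is just another way to write the same model
model_poly <- glm(y ~ poly(x1, 2, raw = TRUE) + x2 + x3,
                  family = binomial, data = dat)
all.equal(unname(coef(model)), unname(coef(model_poly)))
```

Because the negative coefficient on I(x1^2) makes the parabola open downward, the predicted probability rises toward the turning point and falls again beyond it. Plotting grid$p_hat against grid$x1 is a quick way to check that the fitted curve looks the way you expect.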