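This can be modeled as interval-censored data. Most references on interval censoring are written in a survival-time context, but the same techniques apply to any measurement that is censored. The survival package in R has the methods needed for this. I would suggest an accelerated failure time model.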
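To make the idea concrete, here is a sketch of the likelihood such a model maximizes. The notation is my own assumption, not taken from the package documentation: observation $i$ is only known to lie in $[l_i, u_i]$, the latent response follows a location-scale model $y_i = x_i^\top \beta + \sigma \varepsilon_i$, and $F$ is the distribution function of $\varepsilon_i$ (the standard normal CDF for a Gaussian model). Each observation then contributes the probability of its interval rather than a density value:

$$
L(\beta, \sigma) = \prod_{i=1}^{N} \left[ F\!\left(\frac{u_i - x_i^\top \beta}{\sigma}\right) - F\!\left(\frac{l_i - x_i^\top \beta}{\sigma}\right) \right]
$$

For a log-scale model, the same expression would apply with $l_i$ and $u_i$ replaced by their logarithms.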
For example:

```r
library(survival)

N <- 100
set.seed(123455)

# Simulated predictors
X <- data.frame(
  bathrooms = sample(1:3, size = N, replace = TRUE),
  bedrooms  = sample(1:4, size = N, replace = TRUE)
)

# Latent (uncensored) response: log-normal, median depends on the predictors
Y <- rlnorm(N, log(10000 + X$bathrooms * 10000 + X$bedrooms * 30000), 2)
hist(log10(Y))

# Only the 10,000-wide interval containing each value is observed
Ycensleft  <- floor(Y / 10000) * 10000
Ycensright <- ceiling(Y / 10000) * 10000

# event = 3 marks every observation as interval censored
Ysurv <- Surv(time = Ycensleft, time2 = Ycensright, event = rep(3, N), type = "interval")
sv1 <- survreg(Ysurv ~ bathrooms + bedrooms, data = X, dist = "gaussian")
summary(sv1)
```
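Once fitted, the model can be used much like an ordinary regression. The following is a minimal, self-contained sketch on a separate toy data set. The variable names (`x`, `lower`, `upper`), the bin width of 5, and the true coefficients are all assumptions made up for illustration. It uses the `"interval2"` coding of `Surv`, which takes the two interval endpoints directly instead of an explicit event code.

```r
library(survival)

set.seed(1)
n <- 200
x <- runif(n, 0, 10)
y <- 20 + 2 * x + rnorm(n, sd = 3)   # latent response, never observed directly

# Hypothetical recording scheme: only the bin of width 5 containing y is known
lower <- floor(y / 5) * 5
upper <- lower + 5

# Interval-censored Gaussian regression on the original scale
fit <- survreg(Surv(lower, upper, type = "interval2") ~ x, dist = "gaussian")

# Estimated coefficients, to compare with the true values 20 and 2
est <- coef(fit)
print(est)

# Predicted mean response for two new values of x
pred <- predict(fit, newdata = data.frame(x = c(2, 8)))
print(pred)
```

Because the censoring intervals are narrow relative to the noise, the estimates should land close to the values used in the simulation. With wider bins the standard errors would grow, but the approach stays the same.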