I'm a bit confused about how CP is computed in the summary of an rpart object.
Take this example:
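library(rpart)
# method = "class" is an rpart() argument, not a data.frame column; with a
# factor response rpart() picks it by default, so it is left out here.
df <- data.frame(x = c(1, 2, 3, 3, 3),
                 y = factor(c("a", "a", "b", "a", "b")))
mytree <- rpart(y ~ x, data = df, minbucket = 1, minsplit = 1)
summary(mytree)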
Call:
rpart(formula = y ~ x, data = df, minbucket = 1, minsplit = 1)
n= 5
    CP nsplit rel error xerror      xstd
1 0.50      0       1.0      1 0.5477226
2 0.01      1       0.5      2 0.4472136
Variable importance
x
100
Node number 1: 5 observations,    complexity param=0.5
  predicted class=a  expected loss=0.4  P(node) =1
    class counts:     3     2
   probabilities: 0.600 0.400
  left son=2 (2 obs) right son=3 (3 obs)
  Primary splits:
      x < 2.5 to the left,  improve=1.066667, (0 missing)
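For the root node, I expected the CP to be 0.4, since the probability of misclassifying an observation at the root is 0.4 and the tree size at the root is 0. Is 0.5 the correct CP?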
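
To make the two candidate numbers concrete, here is a small base-R sketch that recomputes them by hand from the data in the example. It does not call rpart. The helper node_errors is a hypothetical name of mine, and the last step rests on an assumption I have not confirmed: that the values in the CP table are divided by the root node's error, which would be consistent with rel error being 1.0 at nsplit 0.

```r
# Data from the example above
x <- c(1, 2, 3, 3, 3)
y <- factor(c("a", "a", "b", "a", "b"))

# Hypothetical helper: misclassified count when a node predicts its majority class
node_errors <- function(labels) length(labels) - max(table(labels))

# Root node: 3 "a" and 2 "b", so 2 of 5 observations are misclassified
root_errors <- node_errors(y)
root_loss   <- root_errors / length(y)            # 0.4, the "expected loss"

# After the primary split x < 2.5
left_errors  <- node_errors(y[x < 2.5])           # a, a    -> 0 errors
right_errors <- node_errors(y[x >= 2.5])          # b, a, b -> 1 error
split_errors <- left_errors + right_errors

# Assumption: errors are rescaled so that the root error equals 1
rel_error_after <- split_errors / root_errors     # 0.5
n_splits_added  <- 1
cp_scaled <- (1 - rel_error_after) / n_splits_added   # 0.5

print(c(root_loss = root_loss, cp_scaled = cp_scaled))
```

Under that assumption the hand calculation gives 0.5, which matches the complexity param reported for node 1, while 0.4 is the unscaled misclassification rate at the root.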