Why does this data throw an error in R fitdistr?

2022-04-08 22:41:16

I am trying to fit a Weibull distribution to this data, but I am running into problems and I don't know why. What is causing the NaNs?

temp <- dput(temp)
c(477.25, 2615.56, 1279.98, 581.57, 13.55, 80.4, 6640.22, 759.46, 
1142.33, 134, 1232.23, 389.81, 7811.65, 992.11, 1152.4, 3139.01, 
2636.78, 3294.75, 2266.95, 32.12, 7356.84, 1448.54, 3606.82, 
465.39, 950.5, 3721.49, 522.01, 1548.62, 2196.3, 256.8, 2959.72, 
214.4, 134, 2307.79, 2112.74)


fitdist(temp, distr = "weibull", method = "mle")
Fitting of the distribution ' weibull ' by maximum likelihood 
Parameters:
          estimate  Std. Error
shape    0.8949019   0.1205351
scale 1803.8816283 357.9042207
Warning messages:
1: In dweibull(c(477.25, 2615.56, 1279.98, 581.57, 13.55, 80.4, 6640.22,  :
  NaNs produced
2: In dweibull(c(477.25, 2615.56, 1279.98, 581.57, 13.55, 80.4, 6640.22,  :
  NaNs produced
3: In dweibull(c(477.25, 2615.56, 1279.98, 581.57, 13.55, 80.4, 6640.22,  :
  NaNs produced
4: In dweibull(c(477.25, 2615.56, 1279.98, 581.57, 13.55, 80.4, 6640.22,  :
  NaNs produced
5: In dweibull(c(477.25, 2615.56, 1279.98, 581.57, 13.55, 80.4, 6640.22,  :
  NaNs produced
6: In dweibull(c(477.25, 2615.56, 1279.98, 581.57, 13.55, 80.4, 6640.22,  :
  NaNs produced
7: In dweibull(c(477.25, 2615.56, 1279.98, 581.57, 13.55, 80.4, 6640.22,  :
  NaNs produced
8: In dweibull(c(477.25, 2615.56, 1279.98, 581.57, 13.55, 80.4, 6640.22,  :
  NaNs produced
9: In dweibull(c(477.25, 2615.56, 1279.98, 581.57, 13.55, 80.4, 6640.22,  :
  NaNs produced
10: In dweibull(c(477.25, 2615.56, 1279.98, 581.57, 13.55, 80.4, 6640.22,  :
  NaNs produced
1 Answer

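The Weibull distribution has two parameters, the scale λ and the shape k (I follow Wikipedia's notation). Both parameters are positive real numbers.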
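For reference, here is the density in the same notation, together with the log-likelihood that a maximum likelihood fit maximizes for a sample x_1, ..., x_n. This is the standard textbook form, not something taken from the package source. The expression only defines a density when λ > 0 and k > 0.

```latex
f(x;\lambda,k) =
\begin{cases}
  \dfrac{k}{\lambda}\left(\dfrac{x}{\lambda}\right)^{k-1}
  e^{-(x/\lambda)^{k}}, & x \ge 0,\\[6pt]
  0, & x < 0,
\end{cases}
\qquad \lambda > 0,\; k > 0

\ell(\lambda,k) = n\log k - n k \log\lambda
  + (k-1)\sum_{i=1}^{n}\log x_i
  - \sum_{i=1}^{n}\left(\frac{x_i}{\lambda}\right)^{k}
```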

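The function fitdist from the fitdistrplus package uses the optim function to find the maximum likelihood estimates of the parameters. By default, optim places no restrictions on the parameters, so it also tries negative values. Negative values for the scale or the shape lie outside the Weibull parameter space, so dweibull returns NaN for them, which is where the "NaNs produced" warnings come from. The estimates themselves are still valid; the warnings only record rejected trial values. With the options lower and upper, you can restrict the parameter search space of optim.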
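The mechanism can be reproduced in base R without the package. The following is only a minimal sketch, not what fitdist does internally: negloglik is a hypothetical helper written for this illustration, and the starting values and parscale setting are assumptions chosen to keep the search well scaled.

```r
# The data from the question.
temp <- c(477.25, 2615.56, 1279.98, 581.57, 13.55, 80.4, 6640.22, 759.46,
          1142.33, 134, 1232.23, 389.81, 7811.65, 992.11, 1152.4, 3139.01,
          2636.78, 3294.75, 2266.95, 32.12, 7356.84, 1448.54, 3606.82,
          465.39, 950.5, 3721.49, 522.01, 1548.62, 2196.3, 256.8, 2959.72,
          214.4, 134, 2307.79, 2112.74)

# Outside the parameter space (shape <= 0 or scale <= 0) dweibull() returns NaN
# and warns "NaNs produced"; the warning is suppressed here to keep output clean.
d_bad <- suppressWarnings(dweibull(100, shape = -1, scale = 1000))
is.nan(d_bad)

# Hypothetical helper: negative log-likelihood of a Weibull sample.
negloglik <- function(par, x) {
  -sum(dweibull(x, shape = par[1], scale = par[2], log = TRUE))
}

# Box-constrained search: "L-BFGS-B" is the optim() method that honours
# lower/upper. A tiny positive bound is used instead of exactly 0 so the
# objective stays finite at the boundary (an assumption of this sketch).
fit <- optim(par = c(1, mean(temp)), fn = negloglik, x = temp,
             method = "L-BFGS-B", lower = c(1e-8, 1e-8),
             control = list(parscale = c(1, 1000)))
fit$par  # shape, scale
```

Because the search never leaves the positive quadrant, no NaN is produced during the optimization, and the result should land near the estimates reported in the question.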

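The gamma distribution also has two parameters and, as with the Weibull distribution, both are positive. So the same restriction lower = c(0, 0) can be used for the gamma distribution.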

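Edit

Here is a small comparison of the Weibull and gamma fits to the posted data. The error for the gamma distribution was caused by bad starting values. I supply them manually, and then the fit runs without errors.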

library(fitdistrplus)

temp <- c(477.25, 2615.56, 1279.98, 581.57, 13.55, 80.4, 6640.22, 759.46, 
          1142.33, 134, 1232.23, 389.81, 7811.65, 992.11, 1152.4, 3139.01, 
          2636.78, 3294.75, 2266.95, 32.12, 7356.84, 1448.54, 3606.82, 
          465.39, 950.5, 3721.49, 522.01, 1548.62, 2196.3, 256.8, 2959.72, 
          214.4, 134, 2307.79, 2112.74)

fit.weibull <- fitdist(temp, distr = "weibull", method = "mle", lower = c(0, 0))
fit.gamma <- fitdist(temp, distr = "gamma", method = "mle", lower = c(0, 0), start = list(scale = 1, shape = 1))
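The starting values scale = 1, shape = 1 above are simply a manual choice. One common alternative, shown here only as a sketch, is to derive them from the method of moments (shape = mean² / variance, scale = variance / mean). The names start_gamma and ll_gamma are hypothetical and introduced for this illustration only.

```r
temp <- c(477.25, 2615.56, 1279.98, 581.57, 13.55, 80.4, 6640.22, 759.46,
          1142.33, 134, 1232.23, 389.81, 7811.65, 992.11, 1152.4, 3139.01,
          2636.78, 3294.75, 2266.95, 32.12, 7356.84, 1448.54, 3606.82,
          465.39, 950.5, 3721.49, 522.01, 1548.62, 2196.3, 256.8, 2959.72,
          214.4, 134, 2307.79, 2112.74)

# Method-of-moments estimates: for a gamma distribution,
# mean = shape * scale and variance = shape * scale^2.
m <- mean(temp)
v <- var(temp)
start_gamma <- list(shape = m^2 / v, scale = v / m)

# Hypothetical helper: gamma log-likelihood of the sample.
ll_gamma <- function(shape, scale) {
  sum(dgamma(temp, shape = shape, scale = scale, log = TRUE))
}

# Compare how far each starting point is from a good fit.
ll_unit <- ll_gamma(shape = 1, scale = 1)
ll_mom  <- ll_gamma(shape = start_gamma$shape, scale = start_gamma$scale)
c(unit_start = ll_unit, moment_start = ll_mom)

# The list could then be passed on, e.g.
# fitdist(temp, distr = "gamma", method = "mle", lower = c(0, 0), start = start_gamma)
```

The moment-based point starts the optimizer at a much higher log-likelihood than (1, 1), which tends to make the search less fragile on data with a large scale such as this.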

Plot the Weibull fit:

plot(fit.weibull)

[Figure: Weibull fit]

And for the gamma distribution:

plot(fit.gamma)

[Figure: gamma fit]

The two fits are practically indistinguishable. Their AICs are nearly identical:

gofstat(list(fit.weibull, fit.gamma))

Goodness-of-fit statistics
                             1-mle-weibull 2-mle-gamma
Kolmogorov-Smirnov statistic    0.07288424  0.07970184
Cramer-von Mises statistic      0.02532353  0.02361358
Anderson-Darling statistic      0.20489012  0.17609146

Goodness-of-fit criteria
                               1-mle-weibull 2-mle-gamma
Aikake's Information Criterion      601.7909    601.5659
Bayesian Information Criterion      604.9016    604.6766
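The two criteria can also be checked by hand from the log-likelihood, using AIC = 2p - 2 log L and BIC = p log(n) - 2 log L with p = 2 parameters. The sketch below plugs in the Weibull estimates printed in the question rather than refitting, so the result should come out close to the Weibull column above, up to rounding of those estimates.

```r
temp <- c(477.25, 2615.56, 1279.98, 581.57, 13.55, 80.4, 6640.22, 759.46,
          1142.33, 134, 1232.23, 389.81, 7811.65, 992.11, 1152.4, 3139.01,
          2636.78, 3294.75, 2266.95, 32.12, 7356.84, 1448.54, 3606.82,
          465.39, 950.5, 3721.49, 522.01, 1548.62, 2196.3, 256.8, 2959.72,
          214.4, 134, 2307.79, 2112.74)

# Estimates copied from the fitdist() output shown in the question.
shape_hat <- 0.8949019
scale_hat <- 1803.8816283

# Log-likelihood of the sample at those estimates.
loglik_weibull <- sum(dweibull(temp, shape = shape_hat, scale = scale_hat,
                               log = TRUE))

n_par <- 2  # shape and scale
aic_weibull <- 2 * n_par - 2 * loglik_weibull
bic_weibull <- n_par * log(length(temp)) - 2 * loglik_weibull
c(AIC = aic_weibull, BIC = bic_weibull)
```

Since both models have two parameters, the difference in AIC between the Weibull and gamma fits equals twice the difference in their maximized log-likelihoods, so a gap of about 0.2 gives no real basis for preferring one over the other.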