用于 k 截断泊松的 MLE

机器算法验证 r 最大似然 泊松分布
2022-04-04 13:12:54

我需要在 R 中实现一个算法来找到 MLEλk(截断级别:只允许大于 k 的值)用于截断泊松分布。我似乎无法在谷歌的任何地方找到这个,而且我一开始就在努力推导出 MLE。

1个回答

更新: 现在问题更具体了,我添加了左截断的对数似然和 R 代码。

右截断。 为了Yi分布为带参数的截断泊松λ和截断参数k, 的对数似然n随机样本由下式给出

log(λ)i=1nyinλnlog(j=0keλλjj!)

这是最大化的任何值λ什么时候k^=max(y1,y2,,yn)可以取的最小值)。的最大似然估计需要一个迭代过程。Moore (1952, Biometrika) 提供了一个很好且易于计算的初始估计:kλλ^

λ^0=i=1nyii=1nI(yi<k^)

其中是指示函数,如果取 1 ,否则取 0。(所以这是一个向上调整的平均值。)I()yi<k^

一些不那么优雅的 R 代码如下:

# MLE estimation of a truncated Poisson with unknown truncation level

# Objective is to find estimates of lambda (underlying Poisson mean) and
# k (unknown trunction value).

# Generate some samples from a known truncated Poisson distribution
  lambda <- 10  # Poisson mean
  k <- 8        # Truncation level
  n <- 100      # Sample size
  y <- rpois(n*4,lambda) # Oversample to be more certain of getting n samples
  # Keep just the first n legitimate observations
  y <- y[y <= k][1:n]

# MLE for k and initial estimate for lambda
# From Moore (1952)  Biometrika
# "The estimation of the Poisson parameter from a truncated distribution"
  khat <- max(y)
  lambda0 <- sum(y)/length(y[y < khat])

# Define log of the likelihood function
  logL <- function(lambda,ysum,n,k) {
    log(lambda)*ysum - n*lambda - n*ppois(k,lambda,log.p=TRUE)
  }

# Find maximum likelihood estimate of lambda  
  trPoisson <- optim(lambda0, logL, ysum=sum(y), n=length(y), k=khat, 
    method="BFGS", control=list(fnscale=-1))

# Show results
  cat("MLE of truncation value =",khat,"\n")
  # MLE of truncation value = 8 
  cat("MLE of lambda =",trPoisson$par,"\n")
  # MLE of lambda = 9.765054 

左截断。 对于左截断,对数似然如下:

log(λ)i=1nyinλnlog(1j=0k1eλλjj!)

时,最右边的总和为零下面是一些R代码:k<0

# MLE estimation of a left-truncated Poisson with unknown truncation level

# Objective is to find estimates of lambda (underlying Poisson mean) and
# k (unknown trunction value).

# Generate some samples from a known truncated Poisson distribution
  lambda <- 10   # Poisson mean
  k <- 7         # Truncation level
  n <- 100       # Sample size
  y <- rpois(n*30,lambda) # Oversample to be more certain of getting n samples
  y <- y[y >= k][1:n]

# MLE for k and initial estimate for lambda
  khat <- min(y)
  lambda0 <- mean(y)  # Probably any starting value greater than zero will work

# Define log of the likelihood function
  logL <- function(lambda,ysum,n,k) {
       log(lambda)*ysum - n*lambda - n*log(1-ppois(k-1,lambda))
  }

# Find maximum likelihood estimate of lambda  
  trPoisson <- optim(lambda0, logL, ysum=sum(y), n=length(y), k=khat, 
    method="L-BFGS-B", lower=0, upper=Inf, control=list(fnscale=-1))

# Show results
  cat("MLE of truncation value =",khat,"\n")
  cat("MLE of lambda =",trPoisson$par,"\n")