假设参加婚礼的受邀者的决定是独立的,那么将参加婚礼的客人数量可以建模为成功概率不一定相同的伯努利随机变量之和。这对应于泊松二项分布。
设是一个随机变量,对应于个受邀者中将参加您的婚礼的总人数。参与者的预期数量确实是个人“出现”概率的总和,即
鉴于概率质量函数
的形式,置信区间的推导并不简单。然而,它们很容易用蒙特卡洛模拟来近似。XNpi
E(X)=∑i=1Npi.
下图显示了基于 10000 个模拟场景(右)使用 230 名受邀者(左)的一些虚假出现概率的婚礼参与者数量分布示例。用于运行此模拟的 R 代码如下所示;它提供了置信区间的近似值。
## Parameters
N <- 230 # Number of potential guests
nb.sim <- 10000 # Number of simulations
## Create example of groups of guests with same show-up probability
set.seed(345)
tmp <- hist(rbeta(N, 3, 2), breaks = seq(0, 1, length.out = 21))
p <- tmp$breaks[-1] # Group show-up probabilities
n <- tmp$counts # Number of person per group
## Generate number of guests by group
guest.mat <- matrix(NA, nrow = nb.sim, ncol = length(p))
for (j in 1:length(p)) {
guest.mat[, j] <- rbinom(nb.sim, n[j], p[j])
}
## Number of guest per scenario
nb.guests <- apply(guest.mat, 1, sum)
## Result summary
par(mfrow = c(1, 2))
barplot(n, names.arg = p, xlab = "Probability group", ylab = "Group size")
hist(nb.guests, breaks = 21, probability = TRUE, main = "", xlab = "Guests")
par(mfrow = c(1, 1))
## Theoretical mean and variance
c(sum(n * p), sum(n * p * (1-p)))
#[1] 148.8500 43.8475
## Sample mean and variance
c(mean(nb.guests), var(nb.guests))
#[1] 148.86270 43.23657
## Sample quantiles
quantile(nb.guests, probs = c(0.01, 0.05, 0.5, 0.95, 0.99))
#1% 5% 50% 95% 99%
#133.99 138.00 149.00 160.00 164.00