Building a simulation algorithm to check the calibration of Bayesian posterior probabilities

machine algorithm validation  bayesian  simulation  posterior
2022-03-26 16:28:24

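Figuring out how to simulate something is often the best way to understand the underlying principles. I am at a bit of a loss as to how to simulate the following.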

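Suppose that $Y \sim N(\mu, \sigma^2)$ and that $\mu$ has a prior distribution that is $N(\gamma, \tau^2)$. Based on a sample of $n$ observations $Y_1, \dots, Y_n$, abbreviated as $Y$, I am interested in showing to a non-Bayesian that the posterior probability that $\mu > 0 \mid Y$ is well calibrated, e.g., $\mathrm{Prob}(\mu > 0 \mid P) = P$, where $P$ is the posterior probability. A related discussion is here.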

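What I would really like to show is that if one were to do sequential testing and stop sampling when the posterior probability exceeds some level such as 0.95, the probability that $\mu > 0$ is not $< 0.95$.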

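I am trying to convince frequentists that Bayesian probabilities are meaningful without getting into any discussion of type I error. I suppose there is a philosophical problem when talking to a frequentist who entertains null hypotheses: if the prior is continuous (as above), the probability that $\mu = 0$ is zero and no simulation is needed. I would appreciate suggestions on how to think about the whole problem and how to design demonstration simulations. I am used to doing frequentist simulations in which $\mu$ is simply set to a single constant; Bayesians do not condition on $\mu$.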

For the sequential case we set a maximum possible sample size, e.g., $n = 1000$.

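There is a subtlety to this problem that I always have trouble thinking about. A real skeptic is sometimes worried about a claim of efficacy ($\mu > 0$) when the process truly has no effect ($\mu = 0$). Such a skeptic is perhaps giving nonzero probability to $\mu = 0$ (?). Our way of showing that posteriors are calibrated may not make such a skeptic happy, because the skeptic really seems to want to condition on $\mu = 0$, whereas as Bayesians we condition only on what is knowable. Perhaps this is a case where the prior distribution the statistician uses conflicts with a discontinuous prior distribution the skeptic uses?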

2 Answers

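Simulation results will depend on how the parameter is sampled in the simulation. I don't think there is any dispute that the posterior probabilities will be calibrated (in the frequency sense) if the prior probabilities are, so I suspect a simulation will not convince anyone of anything new.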

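Anyway, the sequential sampling case mentioned in the question (third paragraph) can be simulated "as is" by drawing $\mu$ from the prior and then drawing samples given this $\mu$ until $p(\mu > 0 \mid \text{samples}) > 0.95$ or some other termination criterion occurs (another termination criterion is needed because there is a positive probability that the running posterior probability will never exceed $0.95$). Then, for every $p(\mu > 0 \mid \text{samples}) > 0.95$ claim, check whether the underlying sampled $\mu$ is positive and count the number of true positives versus false positives. So, for $i = 1, 2, \ldots$:

  • Sample $\mu_i \sim N(\gamma, \tau^2)$
  • For $j = 1, \ldots$:
    • Sample $y_{i,j} \sim N(\mu_i, \sigma^2)$
    • Compute $p_{i,j} := P(\mu_i > 0 \mid y_{i,1:j})$
    • If $p_{i,j} > 0.95$
      • If $\mu_i > 0$, increment the true positives counter
      • If $\mu_i \le 0$, increment the false positives counter
      • Break from the inner for loop
    • Some other breaking condition, such as $j \ge j_{\max}$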

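The ratio of true positives to all positives will be at least $0.95$, which demonstrates the calibration of the $P(\mu > 0 \mid D) > 0.95$ claims.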
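For reference, the quantity $p_{i,j}$ in the steps above has a closed form under the normal-normal model of the question. The following is a sketch of the standard conjugate update; the symbols $\gamma_j$ and $\tau_j^2$ for the posterior mean and variance after $j$ observations are notation introduced here, not taken from the original post.

```latex
\begin{aligned}
\tau_j^2 &= \left( \frac{1}{\tau^2} + \frac{j}{\sigma^2} \right)^{-1}, \\
\gamma_j &= \tau_j^2 \left( \frac{\gamma}{\tau^2} + \frac{1}{\sigma^2} \sum_{k=1}^{j} y_{i,k} \right), \\
\mu_i \mid y_{i,1:j} &\sim N(\gamma_j, \tau_j^2), \\
p_{i,j} &= P(\mu_i > 0 \mid y_{i,1:j}) = \Phi\!\left( \frac{\gamma_j}{\tau_j} \right).
\end{aligned}
```

Here $\Phi$ is the standard normal CDF. Updating one observation at a time, with the current posterior acting as the prior for the next step, gives the same $\gamma_j$ and $\tau_j^2$.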

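A slow-and-dirty Python implementation (bugs are quite possible, and there is a potential stopping bias in that I debugged until I saw the expected calibration property hold).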

# (C) Juho Kokkala 2016
# MIT License 

import numpy as np
from scipy import stats  # needed for stats.norm.cdf below

np.random.seed(1)

N = 10000
max_samples = 50

gamma = 0.1
tau = 2
sigma = 1

truehits = 0
falsehits = 0

p_positivemus = []

while truehits + falsehits < N:
    # Sample the parameter from prior
    mu = np.random.normal(gamma, tau)

    # For sequential updating of posterior
    gamma_post = gamma
    tau2_post = tau**2

    for j in range(max_samples):
        # Sample data
        y_j = np.random.normal(mu, sigma)

        gamma_post = ( (gamma_post/(tau2_post) + y_j/(sigma**2)) /
                       (1/tau2_post + 1/sigma**2) )
        tau2_post = 1 / (1/tau2_post + 1/sigma**2)

        p_positivemu = 1 - stats.norm.cdf(0, loc=gamma_post,
                                          scale=np.sqrt(tau2_post))

        if p_positivemu > 0.95:
            p_positivemus.append(p_positivemu)
            if mu>0:
                truehits += 1
            else:
                falsehits +=1
            if (truehits+falsehits)%1000 == 0:
                print(truehits / (truehits+falsehits))
                print(truehits+falsehits)
            break

print(truehits / (truehits+falsehits))
print(np.mean(p_positivemus))

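I got $0.9807$ for the proportion of true positives among all claims. This is above $0.95$ because the posterior probability will not hit exactly $0.95$ at the moment of stopping. For this reason the code also tracks the mean "claimed" posterior probability, for which I got $0.9804$.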

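One could also change the prior parameters $\gamma, \tau$ for every $i$ to demonstrate calibration "over all inferences" (if the priors are calibrated). On the other hand, one could perform the posterior updates starting from "wrong" prior hyperparameters (different from those used to draw the ground-truth parameter), in which case the calibration might not hold.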
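The effect of "wrong" hyperparameters can be sketched with a vectorized variant of the simulation. This is only an illustration under stated assumptions: the function name `stopped_claims`, its arguments `true_prior` and `assumed_prior`, and the "optimistic" prior values (2.0, 0.5) are all hypothetical choices made here, not part of the original answer.

```python
import numpy as np
from scipy import stats


def stopped_claims(rng, n_trials, max_samples, true_prior, assumed_prior,
                   sigma=1.0, cut=0.95):
    """Return (mu, claimed posterior prob) for trials that cross the cutoff."""
    gamma_true, tau_true = true_prior      # used to draw the ground-truth mu
    gamma_used, tau_used = assumed_prior   # used by the analyst in the update
    mu = rng.normal(gamma_true, tau_true, size=n_trials)
    y = rng.normal(mu[:, None], sigma, size=(n_trials, max_samples))
    n = np.arange(1, max_samples + 1)
    # Conjugate normal-normal posterior after each of the first n observations
    post_prec = 1.0 / tau_used**2 + n / sigma**2
    post_mean = (gamma_used / tau_used**2
                 + np.cumsum(y, axis=1) / sigma**2) / post_prec
    p_pos = stats.norm.sf(0.0, loc=post_mean, scale=np.sqrt(1.0 / post_prec))
    crossed = p_pos > cut
    rows = np.flatnonzero(crossed.any(axis=1))   # trials that ever stop
    first = crossed.argmax(axis=1)               # index of the first crossing
    return mu[rows], p_pos[rows, first[rows]]


rng = np.random.default_rng(1)
true_prior = (0.1, 2.0)                          # gamma, tau as in the code above
results = {}
for label, assumed in [("matched prior", true_prior),
                       ("optimistic prior", (2.0, 0.5))]:  # hypothetical values
    mu, claimed = stopped_claims(rng, 20000, 50, true_prior, assumed)
    results[label] = (np.mean(mu > 0), np.mean(claimed))
    print(label, "fraction mu > 0:", results[label][0],
          "mean claimed:", results[label][1])
```

With the matched prior, the fraction of stopped trials with mu > 0 should be close to the mean claimed posterior probability. With the mismatched prior the two can differ substantially, because the analyst's prior alone already puts most of its mass above zero.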

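Expanding on the excellent answer by @juho-kokkala, here are results using R. For the prior distribution of the population mean mu I used an equal mixture of two normals with mean zero, one of them very skeptical about large means.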
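As a reading aid for the R functions that follow, here is a sketch of the posterior under a two-component normal mixture prior. The symbols are introduced here for illustration: $w$ is the prior weight (`wt`), $m_k$ and $v_k$ are the prior means and variances (`mu1`, `mu2`, `v1`, `v2`), $\bar{y}$ is the running mean (`stat`), and $s^2$ is its variance (`vstat`, equal to $1/j$ in the simulation because the data SD is 1).

```latex
\begin{aligned}
V_k &= \left( \frac{1}{v_k} + \frac{1}{s^2} \right)^{-1}, \qquad
M_k = V_k \left( \frac{\bar{y}}{s^2} + \frac{m_k}{v_k} \right), \qquad k = 1, 2, \\
w' &= \frac{w \, \phi(\bar{y};\, m_1,\, v_1 + s^2)}
           {w \, \phi(\bar{y};\, m_1,\, v_1 + s^2) + (1 - w) \, \phi(\bar{y};\, m_2,\, v_2 + s^2)}, \\
P(\mu > c \mid \bar{y}) &= 1 - \left[ w' \, \Phi\!\left( \frac{c - M_1}{\sqrt{V_1}} \right)
      + (1 - w') \, \Phi\!\left( \frac{c - M_2}{\sqrt{V_2}} \right) \right].
\end{aligned}
```

Here $\phi(\cdot;\, m,\, v)$ is the normal density with mean $m$ and variance $v$, and $\Phi$ is the standard normal CDF. Each component is updated by the usual conjugate rule, and the mixing weight is re-weighted by how well each component predicts $\bar{y}$.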

## Posterior density for a normal data distribution and for
## a mixture of two normal priors with mixing proportions wt and 1-wt
## and means mu1 mu2 and variances v1 an
## Adapted for LearnBayes package normal.normal.mix function

## Produces a list of 3 functions.  The posterior density and cum. prob.
## function can be called with a vector of posterior means and variances
## if the first argument x is a scalar

mixpost <- function(stat, vstat, mu1=0, mu2=0, v1, v2, wt) {
  if(length(stat) + length(vstat) != 2) stop('improper arguments')
  probs      <- c(wt, 1. - wt)
  prior.mean <- c(mu1, mu2)
  prior.var  <- c(v1,  v2)

  post.precision <- 1. / prior.var + 1. / vstat
  post.var       <- 1. / post.precision
  post.mean <- (stat / vstat + prior.mean / prior.var) / post.precision
  pwt       <- dnorm(stat, prior.mean, sqrt(vstat + prior.var))
  pwt       <- probs * pwt / sum(probs * pwt)

  dMix <- function(x, pwt, post.mean, post.var)
    pwt[1] * dnorm(x, mean=post.mean[1], sd=sqrt(post.var[1])) +
    pwt[2] * dnorm(x, mean=post.mean[2], sd=sqrt(post.var[2]))
  formals(dMix) <- z <-
    list(x=NULL, pwt=pwt, post.mean=post.mean, post.var=post.var)

  pMix <- function(x, pwt, post.mean, post.var)
    pwt[1] * pnorm(x, mean=post.mean[1], sd=sqrt(post.var[1])) +
    pwt[2] * pnorm(x, mean=post.mean[2], sd=sqrt(post.var[2]))
  formals(pMix) <- z

  priorMix <- function(x, mu1, mu2, v1, v2, wt)
    wt * dnorm(x, mean=mu1, sd=sqrt(v1)) +
    (1. - wt) * dnorm(x, mean=mu2, sd=sqrt(v2))
  formals(priorMix) <- list(x=NULL, mu1=mu1, mu2=mu2, v1=v1, v2=v2, wt=wt)
  list(priorMix=priorMix, dMix=dMix, pMix=pMix)
}

## mixposts handles the case where the posterior distribution function
## is to be evaluated at a scalar x for a vector of point estimates and
## variances of the statistic of interest
## It generates a single function

mixposts <- function(stat, vstat, mu1=0, mu2=0, v1, v2, wt) {
  post.precision1 <- 1. / v1 + 1. / vstat
  post.var1       <- 1. / post.precision1
  post.mean1      <- (stat / vstat + mu1 / v1) / post.precision1

  post.precision2 <- 1. / v2 + 1. / vstat
  post.var2       <- 1. / post.precision2
  post.mean2      <- (stat / vstat + mu2 / v2) / post.precision2

  pwt1 <- dnorm(stat, mean=mu1, sd=sqrt(vstat + v1))
  pwt2 <- dnorm(stat, mean=mu2, sd=sqrt(vstat + v2))
  pwt <- wt * pwt1 / (wt * pwt1 + (1. - wt) * pwt2)

  pMix <- function(x, post.mean1, post.mean2, post.var1, post.var2, pwt)
    pwt        * pnorm(x, mean=post.mean1, sd=sqrt(post.var1)) +
    (1. - pwt) * pnorm(x, mean=post.mean2, sd=sqrt(post.var2))
  formals(pMix) <-
    list(x=NULL, post.mean1=post.mean1, post.mean2=post.mean2,
         post.var1=post.var1, post.var2=post.var2, pwt=pwt)
 pMix
}

## Compute proportion mu > 0 in trials for
## which posterior prob(mu > 0) > 0.95, and also use a loess smoother
## to estimate prob(mu > 0) as a function of the final post prob
## In sequential analyses of observations 1, 2, ..., N, the final
## posterior prob is the post prob at the final sample size if the
## prob never exceeds 0.95, otherwise it is the post prob the first
## time it exceeds 0.95

sim <- function(N, prior.mu=0, prior.sd, wt, mucut=0, postcut=0.95,
                nsim=1000, plprior=TRUE) {
  prior.mu <- rep(prior.mu, length=2)
  prior.sd <- rep(prior.sd, length=2)
  sd1 <- prior.sd[1]; sd2 <- prior.sd[2]
  v1 <- sd1 ^ 2
  v2 <- sd2 ^ 2
  if(plprior) {
    pdensity <- mixpost(1, 1, mu1=prior.mu[1], mu2=prior.mu[2],
                        v1=v1, v2=v2, wt=wt)$priorMix
    x <- seq(-3, 3, length=200)
    plot(x, pdensity(x), type='l', xlab=expression(mu), ylab='Prior Density')
    title(paste(wt, 1 - wt, 'Mixture of Zero Mean Normals\nWith SD=',
                round(sd1, 3), 'and', round(sd2, 3)))
  }
  j <- 1 : N
  Mu <- Post <- numeric(nsim)
  stopped <- integer(nsim)

  for(i in 1 : nsim) {
    # See http://stats.stackexchange.com/questions/70855
    component <- sample(1 : 2, size=1, prob=c(wt, 1. - wt))
    mu <- prior.mu[component] + rnorm(1) * prior.sd[component]
    # mu <- rnorm(1, mean=prior.mu, sd=prior.sd) if only 1 component

    Mu[i] <- mu
    y  <- rnorm(N, mean=mu, sd=1)
    ybar <- cumsum(y) / j
    pcdf <- mixposts(ybar, 1. / j, mu1=prior.mu[1], mu2=prior.mu[2],
                     v1=v1, v2=v2, wt=wt)
    if(i==1) print(body(pcdf))
    post    <- 1. - pcdf(mucut)
    Post[i] <- if(max(post) < postcut) post[N]
               else post[min(which(post >= postcut))]
    stopped[i] <- if(max(post) < postcut) N else min(which(post >= postcut))
  }
  list(mu=Mu, post=Post, stopped=stopped)
}

# Take prior on mu to be a mixture of two normal densities both with mean zero
# One has SD so that Prob(mu > 1) = 0.1
# The second has SD so that Prob(mu > 0.25) = 0.05
prior.sd <- c(1 / qnorm(1 - 0.1), 0.25 / qnorm(1 - 0.05))
prior.sd
set.seed(2)
z <- sim(500, prior.mu=0, prior.sd=prior.sd, wt=0.5, postcut=0.95, nsim=10000)

Prior: equal mixture of two normal distributions

mu   <- z$mu
post <- z$post
st   <- z$stopped
plot(mu, post)
abline(v=0, col=gray(.8)); abline(h=0.95, col=gray(.8))
hist(mu[post >= 0.95], nclass=25)
k <- post >= 0.95
mean(k)   # 0.44 of trials stopped with post >= 0.95
mean(st)  # 313 average sample size
mean(mu[k] > 0)  # 0.963 of trials with post >= 0.95 actually had mu > 0
mean(post[k])    # 0.961 mean posterior prob. when stopped early
w <- lowess(post, mu > 0, iter=0)
# perfect calibration of post probs 
plot(w, type='n',         # even if stopped early
     xlab=expression(paste('Posterior Probability ', mu > 0, ' Upon Stopping')),
     ylab=expression(paste('Proportion of Trials with ',  mu > 0)))
abline(a=0, b=1, lwd=6, col=gray(.85))
lines(w)

Proportion with mu > 0 versus posterior probability upon stopping
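A binned version of the same calibration check can be sketched in Python. This is a simplified illustration, not a port of the R code: it assumes a single normal prior rather than the mixture, and the values of `prior_sd`, `n_max`, `nsim` and the ten equal-width bins are arbitrary choices made here.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
nsim, n_max, cut = 20000, 100, 0.95
prior_mean, prior_sd, sigma = 0.0, 1.0, 1.0   # illustrative values only

mu = rng.normal(prior_mean, prior_sd, size=nsim)
y = rng.normal(mu[:, None], sigma, size=(nsim, n_max))
n = np.arange(1, n_max + 1)
post_prec = 1.0 / prior_sd**2 + n / sigma**2
post_mean = (prior_mean / prior_sd**2
             + np.cumsum(y, axis=1) / sigma**2) / post_prec
p_pos = stats.norm.sf(0.0, loc=post_mean, scale=np.sqrt(1.0 / post_prec))

# Final posterior: value at the first crossing of the cutoff, else at n_max
crossed = p_pos >= cut
idx = np.where(crossed.any(axis=1), crossed.argmax(axis=1), n_max - 1)
final = p_pos[np.arange(nsim), idx]

# Bin the final posterior probabilities; compare with the fraction mu > 0
edges = np.linspace(0.0, 1.0, 11)
bin_id = np.clip(np.digitize(final, edges) - 1, 0, 9)
counts = np.bincount(bin_id, minlength=10)
safe = np.maximum(counts, 1)                  # avoid division by zero
pred = np.bincount(bin_id, weights=final, minlength=10) / safe
obs = np.bincount(bin_id, weights=(mu > 0).astype(float), minlength=10) / safe
for k in range(10):
    if counts[k] > 0:
        print(f"[{edges[k]:.1f}, {edges[k + 1]:.1f}]  n={int(counts[k]):6d}  "
              f"mean posterior={pred[k]:.3f}  fraction mu>0={obs[k]:.3f}")

# Count-weighted mean absolute gap between claimed and observed
weighted_gap = np.sum(counts * np.abs(pred - obs)) / nsim
print("weighted calibration gap:", weighted_gap)
```

If the posterior is calibrated at the stopping time, the two columns should agree within simulation noise in every well-populated bin. Sparse middle bins will be noisier than the bins near 0 and 1.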