Permutation test comparing a single sample against a mean

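When people implement a permutation test to compare a single sample against a mean (for example, a permutation t-test), how is the mean handled? I have seen implementations that take a mean and a sample for a permutation test, but it is not clear what they are actually doing. Is there a meaningful way to run a permutation test (e.g., an analogue of the t-test) of one sample against a hypothesized mean? Or do they simply fall back to a non-permutation test under the hood (for example, defaulting to a standard t-test or similar, despite a permutation function being called or a permutation flag being set)?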

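In a standard two-sample permutation test, one has two groups and randomly reassigns the group labels. But how is this handled when one "group" is a hypothesized mean? Obviously, a hypothesized mean has no sample size of its own. So what is the typical way to cast the mean into a permutation format? Is the "mean" sample assumed to be a single point? A sample the same size as the observed group? An infinitely large sample?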

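Given that a hypothesized mean is, well, hypothesized, I would say it technically has infinite support, or whatever support you want to assume for it. Neither is very useful for actual computation, though. An equal-sized sample whose values all equal the mean seems to be what some tests do (for example, you simply fill in the other half of each pair with the hypothesized location). That makes some sense, since it is the equal-length sample you would see if the hypothesized mean were correct and had no variance.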
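To make the "fill in the other half of each pair" idea concrete, here is a minimal sketch in R. The function name `paired_swap_pvalue`, the argument `mu0`, and the example data are all hypothetical and not taken from any package. It pads the sample with a constant partner equal to the hypothesized mean and randomly swaps the labels within each pair. Swapping a pair simply turns the difference `x[i] - mu0` into `mu0 - x[i]`, so this is the same thing as randomly flipping the signs of the centered data.

```r
# Hypothetical helper: paired label-swap test against a constant partner mu0.
paired_swap_pvalue <- function(x, mu0, nsim = 5000) {
  partner  <- rep(mu0, length(x))      # the "other half" of each pair is just mu0
  observed <- mean(x - partner)        # observed mean paired difference
  sims <- replicate(nsim, {
    swap <- sample(c(TRUE, FALSE), length(x), replace = TRUE)
    a <- ifelse(swap, partner, x)      # swap the labels within the chosen pairs
    b <- ifelse(swap, x, partner)
    mean(a - b)                        # equals mean(+/- (x - mu0))
  })
  # Two-sided p-value: share of relabelings at least as extreme as observed
  mean(abs(sims) >= abs(observed))
}

set.seed(1)                            # made-up data, for illustration only
x <- c(5.1, 4.8, 6.2, 5.9, 5.5, 6.1, 4.9, 5.7)
paired_swap_pvalue(x, mu0 = 5)
```

Note that this relabeling is only justified if the data are assumed to be symmetric about `mu0` under the null hypothesis.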

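So my question is: in practice, when the second group is a mean (or a similar abstract hypothesized value), do people actually emulate permutation-test-style label randomization? If so, how do they handle the label randomization?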

    function (x, nsim) {
        ## Initialize and pre-allocate
        n <- length(x)
        dbar <- mean(x)
        absx <- abs(x)      # computed once and reused below (the original code had a bug: it re-computed abs(x) in the loop instead of using this)
        z <- numeric(nsim)

        ## Run the simulation
        for (i in seq_len(nsim)) {                      # Do nsim times:
            mn <- sample(c(-1, 1), n, replace = TRUE)   # 1. take n random draws from {-1, 1}, where n is the length of the data to be tested
            xbardash <- mean(mn * absx)                 # 2. assign the signs to the data and take the mean in a temporary variable
            z[i] <- xbardash                            # 3. save the resampled mean in an array
        }

        ## Return the p value
        # p = the fraction of fake data that is:
        #   larger than |sample mean of x|, or
        #   smaller than -|sample mean of x|
        (sum(z >= abs(dbar)) + sum(z <= -abs(dbar))) / nsim
    }
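The function above compares the sample mean against zero. For a nonzero hypothesized mean, one option is to subtract it from the data first. For small samples, the random draws can also be replaced by a full enumeration of all 2^n sign patterns. The following is only a sketch under those assumptions: the name `exact_signflip_pvalue` and the argument `mu0` are hypothetical, and the test again assumes the data are symmetric about `mu0` under the null hypothesis.

```r
# Hypothetical helper: exact two-sided sign-flip test of H0: mean = mu0.
# Enumerates every sign pattern, so it is only practical for small n.
exact_signflip_pvalue <- function(x, mu0 = 0) {
  d <- x - mu0                               # center the data at the hypothesized mean
  n <- length(d)
  # All 2^n combinations of -1 / +1, one sign pattern per row
  signs <- as.matrix(expand.grid(rep(list(c(-1, 1)), n)))
  stats <- as.vector(signs %*% abs(d)) / n   # mean of the re-signed data for each pattern
  # Share of patterns at least as extreme as the observed mean;
  # the small tolerance guards against floating-point rounding in ties
  mean(abs(stats) >= abs(mean(d)) - 1e-12)
}

x <- c(5.1, 4.8, 6.2, 5.9, 5.5, 6.1, 4.9, 5.7)   # made-up data, for illustration only
exact_signflip_pvalue(x, mu0 = 5)
```

With `mu0 = 0` this enumerates the same reference distribution that the simulation above samples from, so for large `nsim` the simulated p-value should be close to the exact one.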