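In a previous question, I asked how to test whether a coin is fair. Now I want to check empirically whether that test works.
One answer was that programs like R and Python have built-in binomial tests that can be called to do this.
Here is some Python code that runs such a test for a single case of flipping a fair coin 1000 times: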
import numpy as np
from scipy import stats

def flipAndDoPTest(numberOfFlips, weight):
    # Simulate numberOfFlips tosses: 1 = heads, 0 = tails, P(heads) = weight
    flippedCoinSimulation = np.random.binomial(1, weight, numberOfFlips)
    numberOfHeads = int(np.sum(flippedCoinSimulation == 1))
    # Two-sided exact binomial test of the null hypothesis P(heads) = weight.
    # stats.binomtest replaces stats.binom_test, which was removed in SciPy 1.12.
    pvalue = stats.binomtest(numberOfHeads, numberOfFlips, weight).pvalue
    return pvalue

numberOfFlips = 1000
weight = .50
ptestvalue = flipAndDoPTest(numberOfFlips, weight)
print("the ptest has a value of:", ptestvalue)
if ptestvalue >= .05:
    print("The null hypothesis cannot be rejected at the 5% level of significance because the returned p-value is not less than 5%.")
else:
    print("The null hypothesis can be rejected at the 5% level of significance because the returned p-value is less than 5%.")
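Now I want to test empirically what this "5% level of significance" means. There seems to be a lot of disagreement about how to interpret p-values, so I just want to simulate what happens in my case.
First, I want to check whether a fair coin has a 5% chance of producing a p-value below 0.05. To do this, I repeat the test 1000 times (each test is on one run of 10,000 coin flips) and count how many times the p-value comes out below 0.05. Here is the code: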
numberOfFlips = 10000
weight = .50
numberOfTests = 1000
StatisticalSignificanceProbability = .05
pTestList = np.zeros(numberOfTests)  # one p-value per repeated test
for i in range(numberOfTests):
    # for each i, do a p-test of 10,000 fair coin flips and store its p-value
    ithPTest = flipAndDoPTest(numberOfFlips, weight)
    pTestList[i] = ithPTest
# count how many of the stored p-values fall below .05
numberOfSheerCoincidences = np.sum(pTestList < StatisticalSignificanceProbability)
expectedNumberOfSheerCoincidences = numberOfTests * StatisticalSignificanceProbability
print("numberOfSheerCoincidences: ", numberOfSheerCoincidences)
print("expectedNumberOfSheerCoincidences: ", expectedNumberOfSheerCoincidences)
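The loop above can also be sketched in a faster, vectorized form. This is only a sketch under a few assumptions: `count_rejections` and its parameter names are hypothetical, the head counts are drawn directly as binomial variates instead of summing individual flips, and `stats.binomtest` is used as the exact two-sided test. Since the p-value depends only on the number of heads, it is computed once per distinct head count.

```python
import numpy as np
from scipy import stats

def count_rejections(number_of_flips, weight, number_of_tests, alpha=0.05, rng=None):
    """Count how many of number_of_tests simulated experiments give p < alpha.

    Hypothetical helper: each experiment is number_of_flips tosses of a coin
    with P(heads) = weight, tested against the null P(heads) = weight.
    """
    rng = np.random.default_rng() if rng is None else rng
    # One binomial draw per experiment gives that experiment's number of heads.
    heads = rng.binomial(number_of_flips, weight, size=number_of_tests)
    # The p-value is a function of the head count only, so evaluate the test
    # once for each distinct count and map the results back.
    unique_heads, inverse = np.unique(heads, return_inverse=True)
    unique_p = np.array([
        stats.binomtest(int(k), number_of_flips, weight).pvalue
        for k in unique_heads
    ])
    p_values = unique_p[inverse]
    return int(np.sum(p_values < alpha))

if __name__ == "__main__":
    print("rejections out of 1000:", count_rejections(10000, 0.5, 1000))
```

Passing a seeded generator, for example `rng=np.random.default_rng(0)`, makes a run reproducible.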
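I expected 5% of my 1000 p-values to be below 0.05 (so 0.05*1000 = 50). But every time I run this, I get a number noticeably smaller than 50. Since this count is itself random, I then wrote code to repeat the whole process and plot a histogram of the resulting counts: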
import matplotlib.pyplot as plt

numberOfFlips = 100  # note: 100 flips per test here, not 10,000 as above
weight = .50
numberOfDataPoints = 1000
pTestResultsDataPoints = np.zeros(numberOfDataPoints)  # one count per repetition
for j in range(numberOfDataPoints):
    # repeat the whole collection of p-tests to get a range of different counts
    numberOfTests = 1000
    StatisticalSignificanceProbability = .05
    pTestList = np.zeros(numberOfTests)  # one p-value per test
    for i in range(numberOfTests):
        ithPTest = flipAndDoPTest(numberOfFlips, weight)
        pTestList[i] = ithPTest
    numberOfSheerCoincidences = np.sum(pTestList < StatisticalSignificanceProbability)
    expectedNumberOfSheerCoincidences = numberOfTests * StatisticalSignificanceProbability
    pTestResultsDataPoints[j] = numberOfSheerCoincidences
# histogram of the number of p-values below .05 across all repetitions
n, bins, patches = plt.hist(pTestResultsDataPoints, 50)
plt.show()
With this, I get a distribution centered on 35 rather than 50.
Is this result expected? I was expecting a roughly normal distribution centered on 50.
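One way to check which count to expect, without simulating, is to compute the exact probability that a single test returns p < 0.05 and multiply it by the number of tests. The sketch below does this under an assumption: the two-sided p-value of a head count is taken to be the total probability of all head counts that are no more likely than the observed one, which is how I understand SciPy's exact binomial test to define it. The name `exact_rejection_probability` and its parameters are hypothetical.

```python
import numpy as np
from scipy import stats

def exact_rejection_probability(number_of_flips, null_weight=0.5, alpha=0.05):
    """Probability that the two-sided exact binomial test gives p < alpha
    when the coin really has P(heads) = null_weight (hypothetical helper)."""
    ks = np.arange(number_of_flips + 1)
    pmf = stats.binom.pmf(ks, number_of_flips, null_weight)
    # Assumed p-value definition: sum the probabilities of every outcome that
    # is at most as likely as the observed one (small tolerance for ties).
    sorted_pmf = np.sort(pmf)
    cumulative = np.cumsum(sorted_pmf)
    idx = np.searchsorted(sorted_pmf, pmf * (1 + 1e-7), side="right")
    p_values = cumulative[idx - 1]
    # Add up the probability of all head counts that would be rejected.
    return float(np.sum(pmf[p_values < alpha]))

if __name__ == "__main__":
    number_of_tests = 1000
    for flips in (100, 10000):
        q = exact_rejection_probability(flips)
        print(flips, "flips: rejection probability", q,
              "expected count", number_of_tests * q)
```

Because the number of heads is discrete, this exact probability can fall below the nominal 5%, and the shortfall tends to be larger when the number of flips is small. If each test rejects independently with probability q, the count out of 1000 tests is binomial with mean 1000*q, so the histogram would be centered there rather than at 50.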