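First, you want to know whether the total test time has changed in a statistically significant way. Second, if it has changed, which tests changed?
Here is what I would do: within each code state, compute the mean of each time variable. Then, for each time variable, compute the standard deviation of those means across code states. This measure of variability is how you incorporate information from the entire test history.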
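Next, use t-tests to check for a change from the previous code state (the null hypothesis is that the mean time in the current state equals the mean time in the previous state).
The main test is simply whether the overall time has changed, so you do not need to test a joint hypothesis; a simple t-test is enough. If the total time has changed, I would then compute t-statistics for the individual tests to see which of them are responsible for the change.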
A rough example in R:
# Hypothetical timing data: 100 code states, 5 runs per state.
set.seed(1)  # fixed seed so the simulated numbers are reproducible
state <- rep(1:100, each = 5)
t1 <- 1 + runif(500)
t2 <- 2 + runif(500)
t3 <- 3 + runif(500)
total_time <- t1 + t2 + t3
d <- data.frame(state, total_time, t1, t2, t3)
# Suppose current state is 100, then we want to compare it to state 99
# while taking into account information on variability based on all historical data.
# Means within historical code states:
d_means <- aggregate(d, by=list(d$state), mean)
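# Standard deviation of the state means over the history before the current state,
# scaled by sqrt(2) because each statistic below is a difference of two state means
# (assumed independent with equal variance, so the variances add):
d_stdev <- sqrt(2) * sapply(d_means[d_means$state < 100, ], sd)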
# Central limit theorem tells us means are approx. normally distributed.
# So we can use t-tests.
# Test if total testing time has changed in current state:
previous <- subset(d_means, state == 99)
current <- subset(d_means, state == 100)
t_total_time <- (current$total_time - previous$total_time) / d_stdev[['total_time']]
# Now, for example, if abs(t_total_time) > 1.96, then time change is statistically
# significant at roughly 5% level.
# Check each test to see which ones have statistically significant change from
# previous state:
t_test_1 <- (current$t1 - previous$t1) / d_stdev[['t1']]
t_test_2 <- (current$t2 - previous$t2) / d_stdev[['t2']]
t_test_3 <- (current$t3 - previous$t3) / d_stdev[['t3']]
print(t_total_time)
print(t_test_1)
print(t_test_2)
print(t_test_3)
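The same steps can be wrapped in a small helper that handles any number of timing columns and also reports two-sided p-values. This is only a sketch: `compare_states` and its arguments are hypothetical names introduced here, the p-values rely on the normal approximation mentioned in the comments above, and the denominator is scaled by sqrt(2) on the assumption that the two state means being compared are independent with equal variance. The simulated data deliberately slows down `t2` in state 100 so that there is something to detect.

```r
# Hypothetical helper: compare one code state with an earlier one,
# using the spread of historical state means as the yardstick.
compare_states <- function(d, current, previous = current - 1, state_col = "state") {
  time_cols <- setdiff(names(d), state_col)
  # Mean of every timing column within each code state
  means <- aggregate(d[time_cols], by = d[state_col], FUN = mean)
  # Standard deviation of the state means before the current state
  hist <- means[means[[state_col]] < current, time_cols, drop = FALSE]
  s <- sapply(hist, sd)
  cur  <- unlist(means[means[[state_col]] == current,  time_cols])
  prev <- unlist(means[means[[state_col]] == previous, time_cols])
  # Assumption: the two state means are independent with equal variance,
  # so their difference has standard error sqrt(2) * s
  stat <- (cur - prev) / (sqrt(2) * s)
  # Two-sided p-value from the normal approximation
  p <- 2 * pnorm(-abs(stat))
  data.frame(variable = time_cols, statistic = stat, p_value = p, row.names = NULL)
}

# Simulated data with an injected slowdown of 1 time unit in t2 at state 100
set.seed(42)
state <- rep(1:100, each = 5)
d <- data.frame(state,
                t1 = 1 + runif(500),
                t2 = 2 + runif(500),
                t3 = 3 + runif(500))
d$t2[d$state == 100] <- d$t2[d$state == 100] + 1
d$total_time <- d$t1 + d$t2 + d$t3

res <- compare_states(d, current = 100)
print(res)
```

Read the `total_time` row first. Only if it indicates a change is it worth looking at the rows for the individual tests to see which ones drive it.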