机器算法验证 - 识别校准曲线之间的差异：ANCOVA？ - 吾爱随笔录

识别校准曲线之间的差异：ANCOVA？

机器算法验证 r 安乔娃

2022-03-30 16:45:00

我有大量的校准曲线（即仪器响应与分析物浓度），我想检查它们是否彼此不同。

恰好一条曲线与两个因素（天和治疗）水平的每个独特组合相关联。因此，在 R 中，数据如下所示：

set.seed(0)
day = gl(2, 6, labels=c("day1", "day2"))
treat = gl(2, 3, 12, labels=c("A", "B"))
conc = rep(0:2, 4)
response = conc + rnorm(12)/2
df <- data.frame(day=day, treat=treat, conc=conc, response=response)

我想知道一个或两个因素是否会影响校准曲线的斜率（我不太关心截距）。如果它们不同，我将为每个相应的数据集使用不同的校准曲线；如果它们没有不同，我将对它们进行平均以提高基于校准曲线的测量精度。

使用 ANCOVA 来测试斜率之间的差异是否正确，使用day和的交互作用treat？如果是这样，是正确的 R 代码

model <- lm(response ~ conc + day:treat)

编辑：要清楚：与每条校准曲线一起，我已经测量了数据，我将使用校准曲线对其进行校准。我试图了解我是否应该为每个时间点和治疗使用单独的校准曲线（这将提高准确性，如果时间点和/或治疗有影响）或者是否可以接受平均校准曲线（这将改善精确）。因此，我想知道治疗、时间点或治疗和时间点是否会影响校准曲线的斜率。csv文件的实际数据在这里（未来天数会增加）。

EDIT2：感谢所有参与以下对话的人。为了澄清，我试图了解什么模型会告诉我响应与浓度曲线的斜率是否存在显着差异，无论是一天还是治疗或两者。查看我在此处发布的实际数据，而不是我为简单起见生成的假数据，这可能是有益的。

require(ggplot2)
df <- read.csv("http://dl.dropbox.com/u/48901983/IntrxnDataSo.csv")
ggplot(df, aes(x=conc, y=response, colour=day)) + 
       geom_point() + geom_line() +
       facet_wrap(~treatment)

calib_by_treatment

因此，看起来一天对 response-vs-conc 图的斜率有影响。

ggplot(df, aes(x=conc, y=response, colour=treatment)) + 
       geom_point() + geom_line() +
       facet_wrap(~day)

calib_by_day

目前尚不清楚治疗对响应-浓度图的斜率有影响。

int <- ddply(df, .(treatment, day), function(x) coefficients(lm(x$response~x$conc))[1])
names(int)[3] <- "Intercept"
ggplot(int, aes(x=treatment, y=Intercept, colour=day)) + geom_point()

在此处输入图像描述

看起来治疗对响应与浓度图的截距也有影响。

1个回答

如果你不关心拦截，我想你想要这个：

> model <- lm(response ~ -1 + conc + day:treat)
> anova(model)
Analysis of Variance Table

Response: response
          Df  Sum Sq Mean Sq F value    Pr(>F)    
conc       1 18.5545 18.5545 62.3458 9.909e-05 ***
day:treat  4  1.7860  0.4465  1.5003    0.2996    
Residuals  7  2.0832  0.2976                      
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 
> summary(model)

Call:
lm(formula = response ~ -1 + conc + day:treat)

Residuals:
    Min      1Q  Median      3Q     Max 
-0.5736 -0.4803  0.0222  0.3465  0.6013 

Coefficients:
               Estimate Std. Error t value Pr(>|t|)   
conc             0.6859     0.1929   3.556  0.00927 **
dayday1:treatA   0.6919     0.3693   1.873  0.10316   
dayday2:treatA   0.1093     0.3693   0.296  0.77585   
dayday1:treatB   0.3387     0.3693   0.917  0.38965   
dayday2:treatB   0.7090     0.3693   1.920  0.09636 . 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 0.5455 on 7 degrees of freedom
Multiple R-squared: 0.9071, Adjusted R-squared: 0.8407 
F-statistic: 13.67 on 5 and 7 DF,  p-value: 0.001695

-1中的删除lm拦截。

您的模型中没有真正的“斜率”。你只是有不同的日子和不同的治疗。在上面的结果中，您会看到交互作用并不显着。

大概这不是您的真实数据。model2 <- lm(response ~ -1 + conc + day + treat) 在您的真实数据中，您可能还想查看以下内容： model3 <- lm(response ~ -1 + conc + day*treat)

您可以使用以下代码将其可视化（非常hacky）

plot(as.numeric(day),response,col=as.numeric(treat), xlim=c(1,2.2))
points(as.numeric(day)+0.1,pred,col=as.numeric(treat),pch=2)
for(i in 1:6){lines(c(1,1.1), c(response[i],pred[i]),col=as.numeric(treat)[i])}
for(i in 7:12){lines(c(2,2.1), c(response[i],pred[i]),col=as.numeric(treat)[i])}
legend("bottom", c("treatA","treatB","observed","predicted"), col=c(1,2,1,1),
    lty=c(1,1,NA,NA), pch=c(NA,NA,1,2))

我还是堆栈交换的新手，所以如果有人添加带有链接的评论，该链接显示如何包含此图而不仅仅是代码，我会很感激。

编辑

基于此代码，生成了以下图像。你没有给出pred对象，所以我在这里假设一些事情。预测值更小且透明。

pred <- data.frame(treat = df$treat, conc = df$conc, day = df$day, response = predict(mdl, newdata = df[, 1:3]))

library(ggplot2)
ggplot(df, aes(y = response, x = treat, colour = as.factor(conc))) + 
   geom_jitter(position = position_jitter(width = 0.25), shape = 16, size = 3) +
   geom_point(data = pred, aes(shape = as.factor(conc)), alpha = 0.2) +
   facet_grid(~day) + 
   theme_bw()

在此处输入图像描述

其它你可能感兴趣的问题

上一篇我应该如何解释 Cox 比例风险模型中的交互项？下一篇使用 MAD 作为定义显着性检验阈值的方法