我正在进行的实验具有以下设计:
ABCDEF
BADEFC
ABEFCD
BAFCDE
- 每个字母代表本实验分析的称为“系统”的单一因素的不同水平。该数据集包含八年,我们正在分析的因变量是yield。
A 和 B 可以根据系统类型组合在一起,C 到 F 也可以组合在一起。我知道 AB 组和 CDEF 组之间缺少随机化,这是由于法规所必需的,以及这两个组中缺少随机化,遗憾的是,这根本没有进行。 - 我正在调查系统之间的产量是否存在显着差异 (AF)
我的数据如下所示:
> str(data)
'data.frame': 192 obs. of 6 variables:
$ year : Factor w/ 8 levels "2012","2013",..: 1 1 1 1 1 1 1 1 1 1 ...
$ type : Factor w/ 2 levels "org","pest": 1 1 1 1 1 1 1 1 1 1 ...
$ system: Factor w/ 6 levels "dgst_org","cc_pest",..: 3 3 3 3 5 5 5 5 6 6 ...
$ row : Factor w/ 4 levels "row_1","row_2",..: 1 2 3 4 2 3 4 1 3 4 ...
$ column: Factor w/ 6 levels "column_1","column_2",..: 6 5 4 3 6 5 4 3 6 5 ...
$ yield : num 26.2 41.4 43.4 45 40.8 52.3 47.1 47.2 40.1 42.4 ...
> summary(data)
year type system row column yield
2012 :24 org :128 dgst_org :32 row_1:48 column_1:32 Min. : 26.20
2013 :24 pest: 64 cc_pest :32 row_2:48 column_2:32 1st Qu.: 52.30
2014 :24 cc_org :32 row_3:48 column_3:32 Median : 62.95
2015 :24 manure_pest:32 row_4:48 column_4:32 Mean : 73.79
2016 :24 manure_org :32 column_5:32 3rd Qu.:103.83
2017 :24 fmyd_org :32 column_6:32 Max. :127.10
> head(data,20)
year type system row column yield
377 2012 org cc_org row_1 column_6 26.2
378 2012 org cc_org row_2 column_5 41.4
379 2012 org cc_org row_3 column_4 43.4
380 2012 org cc_org row_4 column_3 45.0
417 2012 org manure_org row_2 column_6 40.8
418 2012 org manure_org row_3 column_5 52.3
419 2012 org manure_org row_4 column_4 47.1
420 2012 org manure_org row_1 column_3 47.2
461 2012 org fmyd_org row_3 column_6 40.1
462 2012 org fmyd_org row_4 column_5 42.4
463 2012 org fmyd_org row_1 column_4 39.5
464 2012 org fmyd_org row_2 column_3 35.7
505 2012 org dgst_org row_4 column_6 57.8
506 2012 org dgst_org row_1 column_5 48.8
507 2012 org dgst_org row_2 column_4 52.3
508 2012 org dgst_org row_3 column_3 64.1
537 2013 org cc_org row_1 column_6 41.2
538 2013 org cc_org row_2 column_5 43.3
539 2013 org cc_org row_3 column_4 57.2
540 2013 org cc_org row_4 column_3 51.1
我试图提出一个适当的线性混合效应模型,但由于实验设计不佳而遇到了一些问题。
我知道这不是问题,只要模型的残差是正态分布的,它们是
> m1 <- lmer(yield ~ system + (1|year) + (1|year:system) + (1|year:column) + (1|year:row), data = data)
> summary(m1)
Linear mixed model fit by REML. t-tests use Satterthwaite's method ['lmerModLmerTest']
Formula: yield ~ system + (1 | year) + (1 | year:system) + (1 | year:column) + (1 | year:row)
Data: data
REML criterion at convergence: 1262.4
Scaled residuals:
Min 1Q Median 3Q Max
-3.2604 -0.4993 0.0596 0.5585 2.3880
Random effects:
Groups Name Variance Std.Dev.
year:column (Intercept) 0.01384 0.1176
year:system (Intercept) 43.85302 6.6222
year:row (Intercept) 2.27887 1.5096
year (Intercept) 22.30702 4.7230
Residual 26.42919 5.1409
Number of obs: 192, groups: year:column, 48; year:system, 48; year:row, 32; year, 8
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 62.981 3.028 27.986 20.801 < 2e-16 ***
systemcc_pest 46.566 3.552 34.309 13.110 6.42e-15 ***
systemcc_org -9.744 3.552 33.574 -2.743 0.00969 **
systemmanure_pest 47.147 3.552 34.309 13.274 4.49e-15 ***
systemmanure_org -8.369 3.552 33.574 -2.356 0.02444 *
systemfmyd_org -10.722 3.552 33.574 -3.019 0.00482 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr) systmcc_p systmcc_r systmmnr_p systmmnr_r
systmcc_pst -0.587
systemcc_rg -0.587 0.500
systmmnr_ps -0.587 0.500 0.500
systmmnr_rg -0.587 0.500 0.500 0.500
systmfmyd_r -0.587 0.500 0.500 0.500 0.500
- 然后我的第一个想法是将整个数据集分成两个数据集(AB 和 CDEF),每个数据集都具有正态分布的数据,并检查系统之间的显着差异,首先是分开,然后是一起检查。
我的 CDEF 组的 lmer 模型是:
m1 <- lmer(yield ~ system + (1|year) + (1|year:system) + (1|year:row) + (1|year:column))
我尝试添加一个额外的随机效应来解释行和列之间的交互,+(1|row:column)
但收到一条错误消息:boundary (singular) fit: see ?isSingular
AB 组的模型是:
m2 <- lmer(yield ~ system + (1|year) + (1|year:system) + (1|year:row))
因为只有单个复制的行。我检查了 emmeans 包,如果组之间存在显着差异,发现 F 之间存在显着差异,产量较高,CDE 产量较低。系统 A 和 B 之间没有发现差异。之后我不知道如何继续和比较两组。
- 我的第二个想法是添加一个考虑系统类型的分组变量,并创建一个可以同时比较整个实验的模型。
我提出的 lmer 模型是:
m3 <- lmer(yield ~ type + system + (1|year) + (1|year:system) + (1|year:type) + (1|year:row))
我又遇到了一些问题,我不知道如何正确嵌套我的固定效果,因为它们显然是嵌套的以及如何考虑列。
正如Russ Lenth在评论中提到的那样,分裂人口是没有意义的,因为这是治疗的影响
因此,我的问题是:
我是否应该划分我的数据集并分别分析两种系统类型(AB 和 CDEF),如果是这样,我如何在 AB 模型中包含列,之后我有什么可能比较 AB 和 CDEF?或者我应该制作一个模型来统治它们并为系统类型创建一个新的分组变量并正确嵌套它们并忽略列的随机效应?
或者你有任何其他想法如何处理这个设计?
新型号
> m1 <- lmer(yield ~ system + (1|year) + (1|year:system) + (1|year:row), data = data)
> summary(m1)
Linear mixed model fit by REML. t-tests use Satterthwaite's method ['lmerModLmerTest']
Formula: yield ~ system + (1 | year) + (1 | year:system) + (1 | year:row)
Data: data
REML criterion at convergence: 1262.4
Scaled residuals:
Min 1Q Median 3Q Max
-3.2609 -0.4988 0.0592 0.5590 2.3885
Random effects:
Groups Name Variance Std.Dev.
year:system (Intercept) 43.868 6.623
year:row (Intercept) 2.276 1.509
year (Intercept) 22.305 4.723
Residual 26.442 5.142
Number of obs: 192, groups: year:system, 48; year:row, 32; year, 8
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 62.981 3.028 28.260 20.799 < 2e-16 ***
systemcc_pest 46.566 3.552 35.000 13.108 4.6e-15 ***
systemcc_org -9.744 3.552 35.000 -2.743 0.00954 **
systemmanure_pest 47.147 3.552 35.000 13.272 3.2e-15 ***
systemmanure_org -8.369 3.552 35.000 -2.356 0.02421 *
systemfmyd_org -10.722 3.552 35.000 -3.018 0.00472 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr) systmcc_p systmcc_r systmmnr_p systmmnr_r
systmcc_pst -0.587
systemcc_rg -0.587 0.500
systmmnr_ps -0.587 0.500 0.500
systmmnr_rg -0.587 0.500 0.500 0.500
systmfmyd_r -0.587 0.500 0.500 0.500 0.500
> m2 <- lmer(yield ~ system + (1|year) + (1|year:row) + (1|year:column), data = data)
> summary(m2)
Linear mixed model fit by REML. t-tests use Satterthwaite's method ['lmerModLmerTest']
Formula: yield ~ system + (1 | year) + (1 | year:row) + (1 | year:column)
Data: data
REML criterion at convergence: 1302.3
Scaled residuals:
Min 1Q Median 3Q Max
-3.0617 -0.5748 0.1023 0.5824 2.7636
Random effects:
Groups Name Variance Std.Dev.
year:column (Intercept) 27.2467 5.2198
year:row (Intercept) 0.2432 0.4932
year (Intercept) 25.0757 5.0076
Residual 38.6421 6.2163
Number of obs: 192, groups: year:column, 48; year:row, 32; year, 8
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 62.981 2.281 12.319 27.616 1.87e-12 ***
systemcc_pest 46.566 2.229 75.612 20.889 < 2e-16 ***
systemcc_org -9.744 1.554 116.002 -6.270 6.39e-09 ***
systemmanure_pest 47.147 2.229 75.612 21.149 < 2e-16 ***
systemmanure_org -8.369 1.554 116.002 -5.385 3.84e-07 ***
systemfmyd_org -10.722 1.554 116.002 -6.899 2.93e-10 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr) systmcc_p systmcc_r systmmnr_p systmmnr_r
systmcc_pst -0.405
systemcc_rg -0.341 0.349
systmmnr_ps -0.405 0.757 0.349
systmmnr_rg -0.341 0.349 0.500 0.349
systmfmyd_r -0.341 0.349 0.500 0.349 0.500

