实验和数据
我正在进行的实验具有以下设计:
ABCDEF
BADEFC
ABEFCD
BAFCDE
- 每个字母代表本实验分析的称为“系统”的单一因素的不同水平。该数据集包含八年,我们正在分析的因变量是yield。A 和 B 可以根据系统类型组合在一起,C 到 F 也可以组合在一起。
- 我知道 AB 组和 CDEF 组之间缺少随机化,这是由于法规所必需的,以及这两个组中缺少随机化,遗憾的是,这根本没有进行。然而,这些行可以看作是完整的块。
- 我正在调查系统之间的产量是否存在显着差异 (AF)
我的数据如下所示:
> str(data)
'data.frame': 192 obs. of 6 variables:
$ year : Factor w/ 8 levels "2012","2013",..: 1 1 1 1 1 1 1 1 1 1 ...
$ type : Factor w/ 2 levels "org","pest": 1 1 1 1 1 1 1 1 1 1 ...
$ system: Factor w/ 6 levels "dgst_org","cc_pest",..: 3 3 3 3 5 5 5 5 6 6 ...
$ row : Factor w/ 4 levels "row_1","row_2",..: 1 2 3 4 2 3 4 1 3 4 ...
$ column: Factor w/ 6 levels "column_1","column_2",..: 6 5 4 3 6 5 4 3 6 5 ...
$ yield : num 26.2 41.4 43.4 45 40.8 52.3 47.1 47.2 40.1 42.4 ...
> summary(data)
year type system row column yield
2012 :24 org :128 dgst_org :32 row_1:48 column_1:32 Min. : 26.20
2013 :24 pest: 64 cc_pest :32 row_2:48 column_2:32 1st Qu.: 52.30
2014 :24 cc_org :32 row_3:48 column_3:32 Median : 62.95
2015 :24 manure_pest:32 row_4:48 column_4:32 Mean : 73.79
2016 :24 manure_org :32 column_5:32 3rd Qu.:103.83
2017 :24 fmyd_org :32 column_6:32 Max. :127.10
> head(data)
year type system row column yield
377 2012 org cc_org row_1 column_6 26.2
378 2012 org cc_org row_2 column_5 41.4
379 2012 org cc_org row_3 column_4 43.4
380 2012 org cc_org row_4 column_3 45.0
417 2012 org manure_org row_2 column_6 40.8
418 2012 org manure_org row_3 column_5 52.3
419 2012 org manure_org row_4 column_4 47.1
420 2012 org manure_org row_1 column_3 47.2
461 2012 org fmyd_org row_3 column_6 40.1
462 2012 org fmyd_org row_4 column_5 42.4
我之前的尝试
- 我的第一个模型是根据Piepho 和 Edmondson (2018)的教程创建的:
m1 <- lmer(yield ~ system + (1|year/row) + (1|year:system)
他们建议重复测量以将年份作为随机效应包括重复的嵌套效应(行)以及与主效应系统的交互 - 我还研究了一个模型,其中年份是固定效应,因为我也对年份的差异和每年的系统差异感兴趣:
m2 <- lmer(yield ~ system * year + (1|row), data = data)
- 我比较了这两个模型,检查了它们的摘要,并对函数进行了事后测试,得出了不同的
emmeans()
结果。- m1的 std.err 要高得多。与m2相比,因此在系统的成对比较中发现的差异较小
- m1的 AIC 较高,但 BIC 比m2低
- 两个模型的残差图和QQ图看起来都很好
- 我假设性病的高增长。呃。添加
(1|year:system)
随机交互后,与基线模型相比,这与两种系统类型m0
之间的巨大产量差异有关,因此我尝试通过添加变量来解释这一点。 我将它添加为与 year 的随机交互,因为我希望它是一个随机效果,但只有两个级别,我无法将它添加为单个随机效果:type
m3 <- lmer(yield ~ system + (1|year/row) + (1|year:system) + (1|year:type))
- 现在再次比较模型,以及他们的事后测试,我注意到:
- 标准。呃。的m3在系统类型中有所不同,因此在像m1这样的系统的成对比较中得到了类似的结果
- m1的 AIC 仍然最低,m3的 AIC低于m2
我的问题
我现在不确定该选择哪个模型,我担心m2的交互项掩盖了同一类型系统之间的差异
(1|year:system)
,尤其是org类型的系统(实验设计中的 ABCD)。m1似乎是一个很好的模型,但具有固定效应,而m3满足所有要求并很好地检测到相同类型的系统之间的差异(因为 post hoc 测试中系统类型的 std.err. 不同)
(1|year:type)
但是考虑到两种系统类型的巨大产量差异,添加这种随机效应是否合法
添加摘要
#The models
> m0 <- lmer(yield ~ system + (1|row), data = data)
> m1 <- lmer(yield ~ system + (1|year) + (1|year:system) + (1|year:row), data = data)
> m2 <- lmer(yield ~ system * year + (1|row), data = data)
> m3 <- lmer(yield ~ system + (1|year) + (1|year:system) + (1|year:type) + (1|year:row), data = data)
#Model Compairison
> anova(m0,m1,m2,m3)
refitting model(s) with ML (instead of REML)
Data: data
Models:
m0: yield ~ system + (1 | row)
m1: yield ~ system + (1 | year) + (1 | year:system) + (1 | year:row)
m3: yield ~ system + (1 | year) + (1 | year:system) + (1 | year:type) + (1 | year:row)
m2: yield ~ system * year + (1 | row)
npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
m0 8 1414.6 1440.7 -699.30 1398.6
m1 10 1305.3 1337.9 -642.67 1285.3 113.26 2 < 2.2e-16 ***
m3 11 1283.7 1319.6 -630.86 1261.7 23.61 1 1.180e-06 ***
m2 50 1215.6 1378.5 -557.80 1115.6 146.13 39 2.681e-14 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#Model Summaries
> summary(m0)
Linear mixed model fit by REML. t-tests use Satterthwaite's method ['lmerModLmerTest']
Formula: yield ~ system + (1 | row)
Data: data
REML criterion at convergence: 1380.9
Scaled residuals:
Min 1Q Median 3Q Max
-2.9797 -0.7010 0.0885 0.6564 3.1912
Random effects:
Groups Name Variance Std.Dev.
row (Intercept) 2.503 1.582
Residual 86.550 9.303
Number of obs: 192, groups: row, 4
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 53.2375 1.8250 26.7856 29.172 < 2e-16 ***
systemcc_pest 56.3094 2.3258 183.0000 24.211 < 2e-16 ***
systemdgst_org 9.7438 2.3258 183.0000 4.189 4.35e-05 ***
systemfmyd_org -0.9781 2.3258 183.0000 -0.421 0.675
systemmanure_org 1.3750 2.3258 183.0000 0.591 0.555
systemmanure_pest 56.8906 2.3258 183.0000 24.461 < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr) systmc_ systmd_ systmf_ systmmnr_r
systmcc_pst -0.637
systmdgst_r -0.637 0.500
systmfmyd_r -0.637 0.500 0.500
systmmnr_rg -0.637 0.500 0.500 0.500
systmmnr_ps -0.637 0.500 0.500 0.500 0.500
> summary(m1)
Linear mixed model fit by REML. t-tests use Satterthwaite's method ['lmerModLmerTest']
Formula: yield ~ system + (1 | year) + (1 | year:system) + (1 | year:row)
Data: data
REML criterion at convergence: 1262.4
Scaled residuals:
Min 1Q Median 3Q Max
-3.2609 -0.4988 0.0592 0.5590 2.3885
Random effects:
Groups Name Variance Std.Dev.
year:system (Intercept) 43.868 6.623
year:row (Intercept) 2.276 1.509
year (Intercept) 22.305 4.723
Residual 26.442 5.142
Number of obs: 192, groups: year:system, 48; year:row, 32; year, 8
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 53.2375 3.0281 28.2596 17.581 < 2e-16 ***
systemcc_pest 56.3094 3.5524 34.9998 15.851 < 2e-16 ***
systemdgst_org 9.7438 3.5524 34.9998 2.743 0.00954 **
systemfmyd_org -0.9781 3.5524 34.9998 -0.275 0.78467
systemmanure_org 1.3750 3.5524 34.9998 0.387 0.70105
systemmanure_pest 56.8906 3.5524 34.9998 16.015 < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr) systmc_ systmd_ systmf_ systmmnr_r
systmcc_pst -0.587
systmdgst_r -0.587 0.500
systmfmyd_r -0.587 0.500 0.500
systmmnr_rg -0.587 0.500 0.500 0.500
systmmnr_ps -0.587 0.500 0.500 0.500 0.500
> summary(m2)
Linear mixed model fit by REML. t-tests use Satterthwaite's method ['lmerModLmerTest']
Formula: yield ~ system * year + (1 | row)
Data: data
REML criterion at convergence: 944.7
Scaled residuals:
Min 1Q Median 3Q Max
-3.5152 -0.5168 0.0678 0.5333 2.5714
Random effects:
Groups Name Variance Std.Dev.
row (Intercept) 3.787 1.946
Residual 24.931 4.993
Number of obs: 192, groups: row, 4
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 39.000 2.679 79.240 14.555 < 2e-16 ***
systemcc_pest 77.475 3.531 141.000 21.943 < 2e-16 ***
systemdgst_org 16.750 3.531 141.000 4.744 5.08e-06 ***
systemfmyd_org 0.425 3.531 141.000 0.120 0.904359
systemmanure_org 7.850 3.531 141.000 2.223 0.027782 *
systemmanure_pest 73.775 3.531 141.000 20.895 < 2e-16 ***
year2013 9.200 3.531 141.000 2.606 0.010152 *
year2014 11.850 3.531 141.000 3.356 0.001015 **
year2015 0.525 3.531 141.000 0.149 0.882006
year2016 20.250 3.531 141.000 5.735 5.70e-08 ***
year2017 21.350 3.531 141.000 6.047 1.26e-08 ***
year2018 37.575 3.531 141.000 10.642 < 2e-16 ***
year2019 13.150 3.531 141.000 3.724 0.000282 ***
systemcc_pest:year2013 -14.950 4.993 141.000 -2.994 0.003252 **
systemdgst_org:year2013 3.350 4.993 141.000 0.671 0.503368
systemfmyd_org:year2013 6.175 4.993 141.000 1.237 0.218255
systemmanure_org:year2013 1.975 4.993 141.000 0.396 0.693040
systemmanure_pest:year2013 -10.450 4.993 141.000 -2.093 0.038152 *
systemcc_pest:year2014 -15.325 4.993 141.000 -3.069 0.002575 **
systemdgst_org:year2014 4.300 4.993 141.000 0.861 0.390600
systemfmyd_org:year2014 5.400 4.993 141.000 1.081 0.281328
systemmanure_org:year2014 0.800 4.993 141.000 0.160 0.872937
systemmanure_pest:year2014 -13.900 4.993 141.000 -2.784 0.006110 **
systemcc_pest:year2015 -16.550 4.993 141.000 -3.315 0.001167 **
systemdgst_org:year2015 -0.725 4.993 141.000 -0.145 0.884761
systemfmyd_org:year2015 2.650 4.993 141.000 0.531 0.596442
systemmanure_org:year2015 -8.025 4.993 141.000 -1.607 0.110246
systemmanure_pest:year2015 -10.925 4.993 141.000 -2.188 0.030316 *
systemcc_pest:year2016 -22.675 4.993 141.000 -4.541 1.19e-05 ***
systemdgst_org:year2016 -13.825 4.993 141.000 -2.769 0.006383 **
systemfmyd_org:year2016 2.050 4.993 141.000 0.411 0.682016
systemmanure_org:year2016 -10.625 4.993 141.000 -2.128 0.035083 *
systemmanure_pest:year2016 -22.000 4.993 141.000 -4.406 2.07e-05 ***
systemcc_pest:year2017 -39.100 4.993 141.000 -7.831 1.05e-12 ***
systemdgst_org:year2017 -15.025 4.993 141.000 -3.009 0.003104 **
systemfmyd_org:year2017 -10.100 4.993 141.000 -2.023 0.044987 *
systemmanure_org:year2017 -9.975 4.993 141.000 -1.998 0.047668 *
systemmanure_pest:year2017 -26.750 4.993 141.000 -5.357 3.36e-07 ***
systemcc_pest:year2018 -49.825 4.993 141.000 -9.979 < 2e-16 ***
systemdgst_org:year2018 -20.625 4.993 141.000 -4.131 6.17e-05 ***
systemfmyd_org:year2018 -13.250 4.993 141.000 -2.654 0.008877 **
systemmanure_org:year2018 -19.025 4.993 141.000 -3.810 0.000207 ***
systemmanure_pest:year2018 -47.400 4.993 141.000 -9.493 < 2e-16 ***
systemcc_pest:year2019 -10.900 4.993 141.000 -2.183 0.030691 *
systemdgst_org:year2019 -13.500 4.993 141.000 -2.704 0.007701 **
systemfmyd_org:year2019 -4.150 4.993 141.000 -0.831 0.407299
systemmanure_org:year2019 -6.925 4.993 141.000 -1.387 0.167660
systemmanure_pest:year2019 -3.650 4.993 141.000 -0.731 0.465990
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation matrix not shown by default, as p = 48 > 12.
Use print(x, correlation=TRUE) or
vcov(x) if you need it
> summary(m3)
Linear mixed model fit by REML. t-tests use Satterthwaite's method ['lmerModLmerTest']
Formula: yield ~ system + (1 | year) + (1 | year:system) + (1 | year:type) + (1 | year:row)
Data: data
REML criterion at convergence: 1241.7
Scaled residuals:
Min 1Q Median 3Q Max
-3.3528 -0.5194 0.0820 0.5278 2.5522
Random effects:
Groups Name Variance Std.Dev.
year:system (Intercept) 12.001 3.464
year:row (Intercept) 2.254 1.501
year:type (Intercept) 50.459 7.103
year (Intercept) 0.000 0.000
Residual 26.453 5.143
Number of obs: 192, groups: year:system, 48; year:row, 32; year:type, 16; year, 8
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 53.2375 2.9504 20.2592 18.044 5.93e-14 ***
systemcc_pest 56.3094 4.1555 19.9300 13.550 1.62e-11 ***
systemdgst_org 9.7437 2.1572 28.3095 4.517 0.000102 ***
systemfmyd_org -0.9781 2.1572 28.3095 -0.453 0.653703
systemmanure_org 1.3750 2.1572 28.3095 0.637 0.528989
systemmanure_pest 56.8906 4.1555 19.9300 13.690 1.35e-11 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr) systmc_ systmd_ systmf_ systmmnr_r
systmcc_pst -0.704
systmdgst_r -0.366 0.260
systmfmyd_r -0.366 0.260 0.500
systmmnr_rg -0.366 0.260 0.500 0.500
systmmnr_ps -0.704 0.865 0.260 0.260 0.260
convergence code: 0
boundary (singular) fit: see ?isSingular
#The Post Hoc Tests
> emmeans(m0, list(pairwise ~ system), adjust = "tukey")
$`emmeans of system`
system emmean SE df lower.CL upper.CL
cc_org 53.2 1.82 26.8 48.1 58.4
cc_pest 109.5 1.82 26.8 104.4 114.7
dgst_org 63.0 1.82 26.8 57.8 68.2
fmyd_org 52.3 1.82 26.8 47.1 57.4
manure_org 54.6 1.82 26.8 49.4 59.8
manure_pest 110.1 1.82 26.8 104.9 115.3
Degrees-of-freedom method: kenward-roger
Confidence level used: 0.95
Conf-level adjustment: sidak method for 6 estimates
$`pairwise differences of system`
contrast estimate SE df t.ratio p.value
cc_org - cc_pest -56.309 2.33 183 -24.211 <.0001
cc_org - dgst_org -9.744 2.33 183 -4.189 0.0006
cc_org - fmyd_org 0.978 2.33 183 0.421 0.9983
cc_org - manure_org -1.375 2.33 183 -0.591 0.9915
cc_org - manure_pest -56.891 2.33 183 -24.461 <.0001
cc_pest - dgst_org 46.566 2.33 183 20.021 <.0001
cc_pest - fmyd_org 57.288 2.33 183 24.631 <.0001
cc_pest - manure_org 54.934 2.33 183 23.620 <.0001
cc_pest - manure_pest -0.581 2.33 183 -0.250 0.9999
dgst_org - fmyd_org 10.722 2.33 183 4.610 0.0001
dgst_org - manure_org 8.369 2.33 183 3.598 0.0054
dgst_org - manure_pest -47.147 2.33 183 -20.271 <.0001
fmyd_org - manure_org -2.353 2.33 183 -1.012 0.9136
fmyd_org - manure_pest -57.869 2.33 183 -24.881 <.0001
manure_org - manure_pest -55.516 2.33 183 -23.869 <.0001
Degrees-of-freedom method: kenward-roger
P value adjustment: tukey method for comparing a family of 6 estimates
> emmeans(m1, list(pairwise ~ system), adjust = "tukey")
$`emmeans of system`
system emmean SE df lower.CL upper.CL
cc_org 53.2 3.03 28.3 44.7 61.8
cc_pest 109.5 3.03 28.3 101.0 118.1
dgst_org 63.0 3.03 28.3 54.4 71.5
fmyd_org 52.3 3.03 28.3 43.7 60.8
manure_org 54.6 3.03 28.3 46.0 63.2
manure_pest 110.1 3.03 28.3 101.6 118.7
Degrees-of-freedom method: kenward-roger
Confidence level used: 0.95
Conf-level adjustment: sidak method for 6 estimates
$`pairwise differences of system`
contrast estimate SE df t.ratio p.value
cc_org - cc_pest -56.309 3.55 35 -15.851 <.0001
cc_org - dgst_org -9.744 3.55 35 -2.743 0.0919
cc_org - fmyd_org 0.978 3.55 35 0.275 0.9998
cc_org - manure_org -1.375 3.55 35 -0.387 0.9988
cc_org - manure_pest -56.891 3.55 35 -16.015 <.0001
cc_pest - dgst_org 46.566 3.55 35 13.108 <.0001
cc_pest - fmyd_org 57.288 3.55 35 16.126 <.0001
cc_pest - manure_org 54.934 3.55 35 15.464 <.0001
cc_pest - manure_pest -0.581 3.55 35 -0.164 1.0000
dgst_org - fmyd_org 10.722 3.55 35 3.018 0.0494
dgst_org - manure_org 8.369 3.55 35 2.356 0.1998
dgst_org - manure_pest -47.147 3.55 35 -13.272 <.0001
fmyd_org - manure_org -2.353 3.55 35 -0.662 0.9849
fmyd_org - manure_pest -57.869 3.55 35 -16.290 <.0001
manure_org - manure_pest -55.516 3.55 35 -15.628 <.0001
Degrees-of-freedom method: kenward-roger
P value adjustment: tukey method for comparing a family of 6 estimates
> emmeans(m2, list(pairwise ~ system), adjust = "tukey")
NOTE: Results may be misleading due to involvement in interactions
$`emmeans of system`
system emmean SE df lower.CL upper.CL
cc_org 53.2 1.31 7.65 48.6 57.9
cc_pest 109.5 1.31 7.65 104.9 114.2
dgst_org 63.0 1.31 7.65 58.4 67.6
fmyd_org 52.3 1.31 7.65 47.6 56.9
manure_org 54.6 1.31 7.65 50.0 59.2
manure_pest 110.1 1.31 7.65 105.5 114.7
Results are averaged over the levels of: year
Degrees-of-freedom method: kenward-roger
Confidence level used: 0.95
Conf-level adjustment: sidak method for 6 estimates
$`pairwise differences of system`
contrast estimate SE df t.ratio p.value
cc_org - cc_pest -56.309 1.25 141 -45.109 <.0001
cc_org - dgst_org -9.744 1.25 141 -7.806 <.0001
cc_org - fmyd_org 0.978 1.25 141 0.784 0.9699
cc_org - manure_org -1.375 1.25 141 -1.102 0.8800
cc_org - manure_pest -56.891 1.25 141 -45.575 <.0001
cc_pest - dgst_org 46.566 1.25 141 37.304 <.0001
cc_pest - fmyd_org 57.288 1.25 141 45.893 <.0001
cc_pest - manure_org 54.934 1.25 141 44.008 <.0001
cc_pest - manure_pest -0.581 1.25 141 -0.466 0.9972
dgst_org - fmyd_org 10.722 1.25 141 8.589 <.0001
dgst_org - manure_org 8.369 1.25 141 6.704 <.0001
dgst_org - manure_pest -47.147 1.25 141 -37.769 <.0001
fmyd_org - manure_org -2.353 1.25 141 -1.885 0.4156
fmyd_org - manure_pest -57.869 1.25 141 -46.359 <.0001
manure_org - manure_pest -55.516 1.25 141 -44.474 <.0001
Results are averaged over the levels of: year
Degrees-of-freedom method: kenward-roger
P value adjustment: tukey method for comparing a family of 6 estimates
> emmeans(m3, list(pairwise ~ system), adjust = "tukey")
$`emmeans of system`
system emmean SE df lower.CL upper.CL
cc_org 53.2 2.95 19.9 44.6 61.9
cc_pest 109.5 2.95 19.9 100.9 118.2
dgst_org 63.0 2.95 19.9 54.4 71.6
fmyd_org 52.3 2.95 19.9 43.6 60.9
manure_org 54.6 2.95 19.9 46.0 63.2
manure_pest 110.1 2.95 19.9 101.5 118.7
Degrees-of-freedom method: kenward-roger
Confidence level used: 0.95
Conf-level adjustment: sidak method for 6 estimates
$`pairwise differences of system`
contrast estimate SE df t.ratio p.value
cc_org - cc_pest -56.309 4.16 10.1 -13.550 <.0001
cc_org - dgst_org -9.744 2.16 28.0 -4.517 0.0013
cc_org - fmyd_org 0.978 2.16 28.0 0.453 0.9973
cc_org - manure_org -1.375 2.16 28.0 -0.637 0.9871
cc_org - manure_pest -56.891 4.16 10.1 -13.690 <.0001
cc_pest - dgst_org 46.566 4.16 10.1 11.206 <.0001
cc_pest - fmyd_org 57.288 4.16 10.1 13.786 <.0001
cc_pest - manure_org 54.934 4.16 10.1 13.220 <.0001
cc_pest - manure_pest -0.581 2.16 28.0 -0.269 0.9998
dgst_org - fmyd_org 10.722 2.16 28.0 4.970 0.0004
dgst_org - manure_org 8.369 2.16 28.0 3.879 0.0069
dgst_org - manure_pest -47.147 4.16 10.1 -11.346 <.0001
fmyd_org - manure_org -2.353 2.16 28.0 -1.091 0.8809
fmyd_org - manure_pest -57.869 4.16 10.1 -13.926 <.0001
manure_org - manure_pest -55.516 4.16 10.1 -13.359 <.0001
Degrees-of-freedom method: kenward-roger
P value adjustment: tukey method for comparing a family of 6 estimates