如何使用具有相关结构的 GLS 来比较两个温度时间序列?

机器算法验证 r 时间序列 相关性 广义最小二乘法
2022-04-17 05:30:50

我有 2 个温度时间序列,在一个海湾的 2 个站点用 2 个传感器记录。传感器在相同的时间段内以相同的频率进行记录。2 个站点具有不同的深度(A = 深,B = 浅)。

我想能够说浅处是否比深处温暖。

我对时间序列的形状不感兴趣。

我的数据集有一个变量site,一个是julian day (jday),然后是每天的平均温度(temp.avg)和标准误差(se,仅用于绘图)

dTMP<-structure(list(jday = c(188L, 188L, 189L, 189L, 190L, 190L, 191L,
191L, 192L, 192L, 193L, 193L, 194L, 194L, 195L, 195L, 196L, 196L, 
197L, 197L, 198L, 198L, 199L, 199L, 200L, 200L, 201L, 201L, 202L, 
202L, 203L, 203L, 204L, 204L, 205L, 205L, 206L, 206L, 207L, 207L, 
208L, 208L, 209L, 209L, 210L, 210L, 211L, 211L, 212L, 212L, 213L, 
213L, 214L, 214L, 215L, 215L, 216L, 216L, 217L, 217L, 218L, 218L, 
219L, 219L, 220L, 220L, 221L, 221L, 222L, 222L, 223L, 223L, 224L, 
224L, 225L, 225L, 226L, 226L, 227L, 227L, 228L, 228L, 229L, 229L, 
230L, 230L, 231L, 231L, 232L, 232L, 233L, 233L, 234L, 234L, 235L, 
235L, 236L, 236L, 237L, 237L, 238L, 238L, 239L, 239L, 240L, 240L, 
241L, 241L, 242L, 242L, 243L, 243L, 244L, 244L, 245L, 245L, 246L, 
246L, 247L, 247L, 248L, 248L, 249L, 249L, 250L, 250L, 251L, 251L, 
252L, 252L, 253L, 253L, 254L, 254L, 255L, 255L, 256L, 256L, 257L, 
257L, 258L, 258L, 259L, 259L, 260L, 260L, 261L, 261L, 262L, 262L, 
263L, 263L, 264L, 264L, 265L, 265L, 266L, 266L, 267L, 267L, 268L, 
268L, 269L, 269L, 270L, 270L, 271L, 271L, 272L, 272L, 273L, 273L, 
274L, 274L, 275L, 275L, 276L, 276L, 277L, 277L, 278L, 278L, 279L, 
279L, 280L, 280L, 281L, 281L), site = c("A", "B", "A", "B", "A", 
"B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", 
"A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", 
"B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", 
"A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", 
"B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", 
"A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", 
"B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", 
"A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", 
"B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", 
"A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", 
"B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", 
"A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", 
"B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", 
"A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", 
"B"), temp.avg = c(5.99097902097902, 6.09086713286713, 7.42745833333333, 
8.1678125, 6.88500694444444, 8.23406944444444, 6.12922222222222, 
7.26203472222222, 6.65711111111111, 6.40924305555556, 8.31249305555555, 
8.32240277777778, 7.90968055555556, 8.42810416666667, 8.95072222222222, 
9.70318055555556, 9.85254861111111, 11.4095486111111, 9.16079166666667, 
11.0159375, 7.79172916666667, 9.49722222222222, 5.92420138888889, 
7.548875, 5.80441666666667, 6.89536805555556, 5.63051388888889, 
6.84440277777778, 6.24066666666667, 7.36372222222222, 5.56700694444444, 
6.445375, 6.19682638888889, 6.48272222222222, 6.77991666666667, 
7.06177777777778, 7.8626875, 8.15279861111111, 8.97194444444444, 
9.16716666666667, 10.5453472222222, 10.578875, 10.2557361111111, 
11.9828888888889, 10.7251736111111, 12.4082430555556, 11.4249166666667, 
12.098875, 11.8695, 12.2577361111111, 11.9151319444444, 12.2548541666667, 
11.4099513888889, 11.7989097222222, 11.0229027777778, 11.2145, 
10.8331736111111, 11.2092083333333, 11.2498819444444, 11.6196180555556, 
11.2190416666667, 12.3128958333333, 9.60270138888889, 11.6026111111111, 
8.4145, 9.01735416666667, 9.02104861111111, 9.877, 9.37068055555556, 
10.2840416666667, 9.59797222222222, 10.5512777777778, 9.90951388888889, 
10.4284722222222, 10.5020347222222, 11.006375, 11.0382777777778, 
11.6005972222222, 11.0032222222222, 11.5620277777778, 11.9508472222222, 
12.2476041666667, 11.3607430555556, 12.1189722222222, 10.9222222222222, 
12.3047291666667, 9.87395138888889, 10.6235972222222, 10.3749027777778, 
10.7676666666667, 10.7136388888889, 11.9661875, 9.41833333333333, 
11.2963611111111, 9.05438194444444, 11.3133680555556, 9.29961805555556, 
10.7892638888889, 9.13478472222222, 9.78157638888889, 10.0053055555556, 
10.2844236111111, 11.2174722222222, 11.3485555555556, 10.5509652777778, 
11.3064583333333, 10.9914722222222, 11.1644097222222, 11.0532986111111, 
11.3837847222222, 10.6440347222222, 11.1492361111111, 10.2033819444444, 
10.906625, 8.8553125, 11.3004097222222, 7.00258333333333, 9.425375, 
7.39536805555556, 8.19988888888889, 6.81268055555556, 7.90986805555556, 
6.98128472222222, 7.428125, 7.55517361111111, 8.1825625, 7.2924375, 
7.61071527777778, 8.64228472222222, 8.76127083333333, 9.1655, 
9.96297916666667, 9.12442361111111, 9.43466666666667, 9.36382638888889, 
9.8573125, 7.73863888888889, 9.06475, 6.86245138888889, 7.268, 
7.28360416666667, 8.03669444444444, 7.46970833333333, 7.80116666666667, 
7.90136111111111, 8.77240972222222, 8.38903472222222, 8.69964583333333, 
9.0513125, 9.15219444444444, 10.0108819444444, 9.94815277777778, 
9.73478472222222, 10.1262916666667, 9.17533333333333, 9.6718125, 
9.18674305555555, 10.1854375, 9.16936111111111, 10.0093680555556, 
8.82407638888889, 9.55520833333333, 9.40145833333333, 9.53936111111111, 
8.80591666666667, 9.36661805555556, 8.04295138888889, 8.44384722222222, 
7.49107638888889, 9.37970138888889, 6.54023611111111, 7.47807638888889, 
6.58035416666667, 7.26764583333333, 6.2289375, 6.55975694444444, 
6.55352083333333, 6.78065277777778, 7.3918125, 7.81163194444444, 
8.44518055555556, 9.05359722222222, 8.69172222222222, 9.11545138888889, 
9.03126388888889, 9.05704166666667, 7.69626388888889, 8.15723611111111
), se = c(0.0684446311524084, 0.0467000338163894, 0.0637176833357706, 
0.0776305990169131, 0.0613172362964014, 0.0736477668094532, 0.0594146388107844, 
0.0431317487238723, 0.0968421264052432, 0.0419846933555372, 0.0323949155766505, 
0.0403999999512505, 0.0426753087710917, 0.04431240712146, 0.0685862399045098, 
0.0938783494125271, 0.0908408653823437, 0.0266915764154258, 0.0953725248690779, 
0.105824891014135, 0.0990518606457859, 0.0410508152091538, 0.06043250261061, 
0.0786330141878098, 0.0652480589576939, 0.0696827474348336, 0.0535061067197427, 
0.0741190834496391, 0.0701916014695556, 0.0735485250204054, 0.0458221447421354, 
0.0404458878036966, 0.0359588714516765, 0.0265982467327904, 0.0394296624495542, 
0.0427714222970342, 0.0221140283854524, 0.0384965936341351, 0.0782688867075244, 
0.0247393796918376, 0.0382087990790521, 0.0475342767536421, 0.0479178691306617, 
0.0416523624760465, 0.0559318984651426, 0.114747800062252, 0.0320540477244992, 
0.0779599317541411, 0.0221973209356683, 0.0636192257453038, 0.0292632222064691, 
0.0333513179466007, 0.0355241640648164, 0.0432614608297386, 0.0385724506235991, 
0.037197666261358, 0.0402237976528405, 0.0491513763461124, 0.0612527892079573, 
0.0568085063216254, 0.0687175273844523, 0.0294338061276067, 0.0636253008809029, 
0.0912451311678636, 0.0384309704673781, 0.0221135650232409, 0.0499603906142495, 
0.0732352504170765, 0.0496670437447336, 0.0392916567790627, 0.0518190951599677, 
0.0451864397798832, 0.0449763902345077, 0.0376188816790774, 0.0353480974964637, 
0.043738970054825, 0.0261383200661106, 0.0233255404890622, 0.0552663268605627, 
0.0244428852057977, 0.0366775615825884, 0.0262850373316128, 0.0611982483162885, 
0.0555639414319732, 0.0795167431194783, 0.0484336444188232, 0.0580722515120101, 
0.0511506568571288, 0.0695252481623487, 0.047652522678636, 0.131585829693245, 
0.0672199989613813, 0.0873260603254596, 0.106620462572398, 0.0429877006186673, 
0.097603870391481, 0.044142339472256, 0.0616361009000405, 0.0444095513606984, 
0.0671187567217355, 0.0882926151247376, 0.0591708833159303, 0.0722583700480519, 
0.0521225933874304, 0.0867660870781919, 0.0927982580298578, 0.0286604318185285, 
0.0165244497961037, 0.0215911564780479, 0.0271993439034762, 0.033820745033179, 
0.0287039604443949, 0.046389154669938, 0.030690306262797, 0.144168379023189, 
0.0656152472038106, 0.0713610892589061, 0.118974858624633, 0.100850818065691, 
0.0946383067351734, 0.0526435227437007, 0.105288190570795, 0.0590228559096837, 
0.0641402829806298, 0.0292958864631491, 0.0528746278867272, 0.0201978238929515, 
0.0140800818414335, 0.0442948108734875, 0.0693191947399626, 0.0144639720670529, 
0.04125110776308, 0.0095055017186666, 0.0303011849671232, 0.0125464998735671, 
0.0489897150266719, 0.0751016435113841, 0.0789040475999884, 0.0440936049937659, 
0.0211301628036506, 0.0755702005272336, 0.0540554625729186, 0.073399126077332, 
0.0956125027382699, 0.0414118860902094, 0.04826324636017, 0.0501105502933141, 
0.0534612307413563, 0.0340665419463383, 0.0207860495534631, 0.0147775084791049, 
0.0124121556684425, 0.0176000529860602, 0.0147256372851489, 0.0289876122297381, 
0.0244796725385413, 0.0127740158302701, 0.0158948404114211, 0.0242120175935621, 
0.0228040488763762, 0.015269392245137, 0.0334098420839866, 0.0147282950908125, 
0.0352636487530586, 0.0524948183516093, 0.0327090521078623, 0.0427079881785599, 
0.0573994892952229, 0.062267865006105, 0.0297565541281921, 0.0142227669754567, 
0.0675999462075173, 0.00781516310476744, 0.0415315319925174, 
0.0295436118614684, 0.0251498717608074, 0.0465980555908037, 0.0382177558832005, 
0.0449598834488143, 0.0293496702138142, 0.0151548810889912, 0.0434966522028971, 
0.0364841418068765, 0.0294334202340891, 0.0233746698992999, 0.00814461650841544, 
0.0172561458993778, 0.0183854419086275)), class = "data.frame", row.names = c(NA, 
-188L), .Names = c("jday", "site", "temp.avg", "se"))

 dTMP$site<-as.factor(dTMP$site)   

如果我绘制数据,我会得到这张图:

plot(temp.avg~jday, data=dTMP[dTMP$site=='A',], type='p', col='blue',ylim=c(5,13), pch=20 )
lines(temp.avg~jday, data=dTMP[dTMP$site=='A',], type='l', col='blue',ylim=c(5,13), lty=1 )
with(dTMP[dTMP$site=='A',], polygon(x=c(jday, rev(jday)) , y= c(temp.avg+1.96*se,rev(temp.avg-1.96*se)), border=NA, col=rgb(0,0,0,0.2) ))
par(new=T)
plot(temp.avg~jday, data=dTMP[dTMP$site=='B',], type='p', col='red',ylim=c(5,13), pch=20 )
lines(temp.avg~jday, data=dTMP[dTMP$site=='B',], type='l', col='red',ylim=c(5,13), lty=1 )
with(dTMP[dTMP$site=='B',], polygon(x=c(jday, rev(jday)) , y= c(temp.avg+1.96*se,rev(temp.avg-1.96*se)), border=NA, col=rgb(0,0,0,0.2) ))
legend(x='topleft',legend=c('A','B'), pch=c(20,20),col=c('blue','red'))

在此处输入图像描述

从这个情节我可以看出,这两个系列在形状上非常相似,但一个总是低于另一个。此外,95% CI(阴影区域)几乎从不重叠,所以我预计这两个站点之间存在显着差异。

现在,我认为回答我的问题的代码(即一个站点比另一个站点更温暖?)是:

library (nlme)    
cs1 <- corARMA(c(0.2, 0.3), form = ~ 1 | site, p = 1, q = 1)
m2a<-gls(temp.avg~site, data=dTMP, correlation=cs1)

summary(m2a)
anova(m2a)

但是,我可以获得的摘要表明没有区别:

Generalized least squares fit by REML
  Model: temp ~ site 
  Data: dTMP 
       AIC      BIC    logLik
  454.5677 470.6964 -222.2839

Correlation Structure: ARMA(1,1)
 Formula: ~1 | site 
 Parameter estimate(s):
     Phi1    Theta1 
0.8661159 0.2603911 

Coefficients:
               Value Std.Error   t-value p-value
(Intercept) 8.550266 0.7183725 11.902274  0.0000
siteB       0.684303 1.0159321  0.673572  0.5014

 Correlation: 
            (Intr)
siteB -0.707

Standardized residuals:
       Min         Q1        Med         Q3        Max 
-1.6126448 -0.5530600  0.2409433  0.9841246  1.7444175 

Residual standard error: 1.949408 
Degrees of freedom: 188 total; 186 residual

我现在的问题是:

1)我是否正确编码模型以考虑站点内的自相关?即是否正确

表格 = ~ 1 | 地点

在我的相关结构中?如果不是我应该使用什么,为什么?

2)用我写的代码,我到底在做什么?我是在比较2系列的形状吗?

我读了 pinheiro 和 bates 2000 和 Zuur 2009,但我不明白。现在我比以往任何时候都更加困惑。

2个回答

回答您的问题的一种方法(我想能够说浅处是否比深处温暖。)将研究两个时间序列 SiteB - SiteA 之间的差异(A=deep,B=浅的)。

两个时间序列都是平稳的。因此时间序列的均值与时间无关。两个时间序列都可以用一个简单的 AR(1) 模型很好地表示。

根据您的数据,我为时间序列 SiteB - SiteA 找到了这个 ARMA 模型:

arima(dTMP_BmA, order = c(2,0,1))
Call:
arima(x = dTMP_BmA, order = c(2, 0, 1))

Coefficients:
         ar1      ar2      ma1  intercept
      1.2600  -0.5131  -0.7269     0.7458
s.e.  0.1694   0.0973   0.1769     0.0553

sigma^2 estimated as 0.2353:  log likelihood = -65.62,  aic = 139.24

残差的自相关图没有显示异常。

“截距”参数 (0.7458) 实际上是时间序列 siteB - SiteA 的估计平均值,标准误差为 0.0553。从这个结果中,我们可以得出结论,SiteB 的平均值和 SiteA 的平均值之间存在显着差异。

两个时间序列的模型 m3a 残差的 1 阶自回归相关性非常高(A:约 0.4 和 B:约 0.6),均在 0.05 水平上显着。因此,summary() 的结果是无效的。

两个 ts 的自相关系数可以通过以下方式可视化:

acf2(residuals(m3a,type="normalized")[seq(1,187,2)],max.lag=50) (注意:来自 pkg astsa 的 acf2) acf2(residuals(m3a,type="normalized")[seq( 2,188,2)],max.lag=50)

增加 AR 阶 p 和 MA 阶 q 将无助于减轻显着的一阶自相关系数。为了减少第一自相关系数,一种方法是结合频域方法。这就是我所做的。

dTMP$time = rep(1:94, each = 2)

m3a<-gls(温度。avg~site+ sin(2*pi*time*1/94)+cos(2*pi*time*1/94)+ sin(2*pi*time*2/94)+ sin(2*pi*time* 3/94)+cos(2*pi*time*3/94)+ sin(2*pi*time*4/94)+cos(2*pi*time*4/94)+ sin(2*pi*时间*5/94)+cos(2*pi*time*5/94)+ sin(2*pi*time*6/94)+cos(2*pi*time*6/94)+ sin(2* pi*time*7/94)+cos(2*pi*time*7/94)+ sin(2*pi*time*8/94)+ cos(2*pi*time*9/94)+ cos( 2*pi*time*10/94)+ sin(2*pi*time*11/94)+ # sin(2*pi*time*12/94)+cos(2*pi*time*12/94) + sin(2*pi*time*13/94)+cos(2*pi*time*13/94)+ # sin(2*pi*time*14/94)+cos(2*pi*time*14 /94)+ cos(2*pi*time*15/94)+ sin(2*pi*time*18/94)+cos(2*pi*time*18/94)+ sin(2*pi*time *23/94)+cos(2*pi*time*23/94)+ # sin(2*pi*time*29/94)+cos(2*pi*time*29/94)+sin(2*pi*time*34/94), # sin(2*pi*time*39/94)+cos(2*pi*time*39/94), data=dTMP, correlation=corARMA(form= ~1|站点,p=1,q=0))

检查新模型的自相关

acf2(residuals(m3a,type="normalized")[seq(1,187,2)],max.lag=50) acf2(residuals(m3a,type="normalized")[seq(2,188,2)],max.滞后=50)

站点 A 和 B 的一阶自相关系数分别降低到大约 0.1 和 0.2。这些好多了。

如果您通过摘要(m3a)检查结果,您将看到站点的显着影响(p = 0)