我在 R 中使用 GAM 来比较来自 3 个国家/地区的时间序列数据。数据集是一年的每小时测量值。这里的主要目的是显示在一天中的哪个时间和一年中的哪一天,数据与平均序列非常匹配。下面的脚本显示了我的尝试:
## Ozone measurements for three countries in Europe
## Find similarities between time series
require(plyr)
require(lattice)
require(mgcv)
TopFolder <- list("http://www.nilu.no/projects/ccc/onlinedata/ozone/CZ03_2009.dat"
,"http://www.nilu.no/projects/ccc/onlinedata/ozone/CY02_2009.dat"
,"http://www.nilu.no/projects/ccc/onlinedata/ozone/BE35_2009.dat"
)
## create variable for the data
data = ldply(TopFolder, header = TRUE, read.table, sep = "", skip = 3)
## define Ozone levels
Ozone <- data$Value
Ozone[Ozone==-999] <- NA
Ozone <- data.frame(Ozone)
## define Datetime - need to concatenate arrays
DateTime <- paste(data$Date,data$Hour, sep = " ")
Date <- as.POSIXct(DateTime, format = "%d.%m.%Y %H:%M")
## define countries
Countries <- c("Czech","Cyprus","Belgium")
Country <- data.frame(Country = rep(Countries, each = 8760))
## bind together
Dat <- cbind(Ozone, Country = Country)
Dat <- transform(Dat, Doy = as.numeric(format(Date,format = "%j")),
Tod = as.numeric(format(Date,format = "%H")),
DecTime = rep(seq(1,365, length = 8760),by = 3))
## plot the Ozone data
xyplot(Ozone~DecTime | Country, data = Dat, type = "l", col = 1,
strip = function(bg = 'white',...)strip.default(bg = 'white',...))
## generalised additive model
mod1 <- gam(Ozone ~ Country + s(Doy,bs = "cc", k = 20) + s(Doy, by = Country, bs = "cc", k = 20) +
s(Tod,bs = "cc", k=7) + s(Tod,by = Country,bs = "cr", k=7),
data = Dat, method = "ML")
plot(mod1,pages=1,scale=0,shade=TRUE)
xyplot(resid(mod1) ~ Doy | Country, data = Dat, type = c("l","smooth"))
mod2 <- gamm(Ozone ~ Country + s(Doy,bs = "cc", k = 20) + s(Doy, by = Country, bs = "cc", k = 20) +
s(Tod,bs = "cc", k=7) + s(Tod,by = Country,bs = "cr", k=7),
data = Dat, method = "ML",correlation = corAR1(form = ~ DecTime | Country))
一个问题是我使用了一个具有相关误差的加法模型,即在每小时测量的每个时间序列中,高浓度之后是另一个高浓度。这可以通过在 GAM 中添加相关矩阵来解决:
但是,R 会抛出一个错误,因为它无法从操作系统获取更多 RAM。因此,为了解决这个问题,我需要: (1) 从 mod1 - 将时间序列模型 (mod3) 拟合到残差 (2) 从 mod3 获得 acf(1) (2) 使用 acf(1) 构建协方差然后我们可以将其传递给 mod2 矩阵,从而简化相关矩阵。
任何有关如何完成上述三个步骤中的部分或全部步骤的建议将不胜感激。虽然我认为我不确定的第一点是如何在 mod1 中提取不同国家的残差?