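Aggregation also works without zoo (shown here with random data for 2 variables over 3 days and 4 hosts, as in JWM's example). I assume you have data from all hosts for every hour.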
nHosts <- 4 # number of hosts
dates <- seq(as.POSIXct("2011-01-01 00:00:00"),
             as.POSIXct("2011-01-03 23:59:30"), by=30)
hosts <- factor(sample(1:nHosts, length(dates), replace=TRUE),
                labels=paste("host", 1:nHosts, sep=""))
x1 <- sample(0:20, length(dates), replace=TRUE) # data from 1st variable
x2 <- rpois(length(dates), 2) # data from 2nd variable
Data <- data.frame(dates=dates, hosts=hosts, x1=x1, x2=x2)
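I'm not entirely sure whether you want to average within each individual hour only, or within each hour of the day across all days. I'll do both.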
Data$hFac <- droplevels(cut(Data$dates, breaks="hour"))
Data$hour <- as.POSIXlt(Data$dates)$hour # extract hour of the day
# average both variables over days within each hour and host
# formula notation was introduced in R 2.12.0 I think
res1 <- aggregate(cbind(x1, x2) ~ hour + hosts, data=Data, FUN=mean)
# only average both variables within each hour and host
res2 <- aggregate(cbind(x1, x2) ~ hFac + hosts, data=Data, FUN=mean)
The results look like this:
> head(res1)
  hour hosts        x1       x2
1    0 host1  9.578431 2.049020
2    1 host1 10.200000 2.200000
3    2 host1 10.423077 2.153846
4    3 host1 10.241758 1.879121
5    4 host1  8.574713 2.011494
6    5 host1  9.670588 2.070588
> head(res2)
                 hFac hosts        x1       x2
1 2011-01-01 00:00:00 host1  9.192308 2.307692
2 2011-01-01 01:00:00 host1 10.677419 2.064516
3 2011-01-01 02:00:00 host1 11.041667 1.875000
4 2011-01-01 03:00:00 host1 10.448276 1.965517
5 2011-01-01 04:00:00 host1  8.555556 2.074074
6 2011-01-01 05:00:00 host1  8.809524 2.095238
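To make the difference between the two grouping keys concrete, here is a minimal sketch on made-up data (the vectors ts and val are hypothetical and not part of the question's data). It puts two readings in the 05:00 hour of one day and one reading in the 05:00 hour of the next day, then averages once by hour of day and once by calendar hour.

```r
# Hypothetical toy timestamps: the same clock hour on two different days
ts  <- as.POSIXct(c("2011-01-01 05:10:00", "2011-01-01 05:40:00",
                    "2011-01-02 05:20:00"), tz = "UTC")
val <- c(1, 3, 8)

hourOfDay <- as.POSIXlt(ts)$hour                   # 5, 5, 5
hourSlot  <- droplevels(cut(ts, breaks = "hour"))  # one level per calendar hour

byHourOfDay <- tapply(val, hourOfDay, mean)  # pools both days into one group
byHourSlot  <- tapply(val, hourSlot, mean)   # keeps the two days separate
```

Grouping by hourOfDay corresponds to res1 (one mean over all three values), grouping by hourSlot corresponds to res2 (one mean per day for that hour).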
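I'm also not entirely sure what kind of plot you want. Here is a bare-bones version of the graph for the first variable only, with a separate line for each host.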
# using the data that is averaged over days as well
res1L <- split(subset(res1, select="x1"), res1$hosts)
mat1 <- do.call(cbind, res1L)
colnames(mat1) <- levels(hosts)
rownames(mat1) <- 0:23
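# pass the hours 0..23 as x so the points line up with the axis labels drawn below
matplot(0:23, mat1, main="x1 per hour, avg. over days", xaxt="n", type="o", pch=16, lty=1, xlab="hour", ylab="x1")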
axis(side=1, at=seq(0, 23, by=2))
legend(x="topleft", legend=colnames(mat1), col=1:nHosts, lty=1)
The same plot for the data that is averaged only within each hour.
res2L <- split(subset(res2, select="x1"), res2$hosts)
mat2 <- do.call(cbind, res2L)
colnames(mat2) <- levels(hosts)
rownames(mat2) <- levels(Data$hFac)
matplot(mat2, main="x1 per hour", type="o", pch=16, lty=1)
legend(x="topleft", legend=colnames(mat2), col=1:nHosts, lty=1)
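The split/cbind reshaping above relies on the assumption that every host has data in every hour; otherwise the pieces have different lengths and the row names no longer fit. As a possible alternative (a sketch, not what the code above does), tapply() can build the hour-by-host matrix directly from the raw data and leaves NA where a combination is missing. The data frame toy below is hypothetical and only serves to show the shape of the result.

```r
# Hypothetical toy data: host2 has no observation in hour 1
toy <- data.frame(hour  = c(0, 0, 1, 0),
                  hosts = factor(c("host1", "host1", "host1", "host2")),
                  x1    = c(2, 4, 10, 7))

# rows = hours, columns = hosts, cells = mean of x1 (NA if no data)
m <- tapply(toy$x1, list(toy$hour, toy$hosts), mean)
```

The matrix m already carries the hours as row names and the hosts as column names, so it could be handed to matplot() in the same way as mat1, for example with as.numeric(rownames(m)) as the x values.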