How to aggregate data by time period?

Posted on 2024-12-13 07:26:46

I have a problem and would really appreciate your help. My data (let's call it "date") looks like this:

location       date  value
       1 2010-01-01    6.4
       1 2010-01-02    5.7
       .
       .  
       2 2010-01-01    0.8
       2 2010-01-02    2.5
       2 2010-01-03    5.5

I would like to aggregate the data (value) by location and by 3-week period. I have already tried the timeSeries package, but it is not working:

by1 <- timeSequence(from = "2009-12-30", to = "2010-12-29", by= "4 week")
by1
aggregate(date, by=list(by1, date$location), sum)

Comments (2)

南…巷孤猫 2024-12-20 07:26:46

Here is an approach using seq.Date to generate the break points, cut to bin your data, and ddply to summarize:

# Create sample data
set.seed(1)
dat <- data.frame(
  location = rep(1:3, each=30),
  date = rep(seq(as.Date("2010-01-01"), by="3 day", length.out=30), 3),
  value=rnorm(90)
)

# Create a sequence of dates at 3-week intervals to serve as cut points
dateCuts <- seq(from=min(dat$date), to=max(dat$date)+31, by="3 week")

# Use cut to separate dates into periods
dat$period <- cut(dat$date, breaks=dateCuts)

# Summarise data
library(plyr)
ddply(dat, .(location, period), summarize, value=mean(value))

The results:

   location     period       value
1         1 2010-01-01  0.04475859
2         1 2010-01-22  0.01062880
3         1 2010-02-12  0.62024902
4         1 2010-03-05 -0.31364304
5         1 2010-03-26 -0.03010425
6         2 2010-01-01 -0.08522653
7         2 2010-01-22  0.37708986
8         2 2010-02-12  0.12910449
9         2 2010-03-05  0.08597110
10        2 2010-03-26  0.21733251
11        3 2010-01-01  0.10295425
12        3 2010-01-22  0.46194453
13        3 2010-02-12 -0.35546029
14        3 2010-03-05  0.17216486
15        3 2010-03-26  0.31855880
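
Since the question asks for sums and tried aggregate, the same binning can also be sketched in base R without plyr. This is a minimal sketch that assumes the column names location, date and value from the sample data; dat, dateCuts and totals are illustrative names, and the values are random, so it is not a definitive solution:

```r
# Sample data with the same shape as in the answer (values are random)
set.seed(1)
dat <- data.frame(
  location = rep(1:3, each = 30),
  date = rep(seq(as.Date("2010-01-01"), by = "3 day", length.out = 30), 3),
  value = rnorm(90)
)

# Cut points every 3 weeks, extended past the last date so every row gets a bin
dateCuts <- seq(from = min(dat$date), to = max(dat$date) + 21, by = "3 week")
dat$period <- cut(dat$date, breaks = dateCuts)

# Base R: sum of value per location and 3-week period
totals <- aggregate(value ~ location + period, data = dat, FUN = sum)
totals[order(totals$location, totals$period), ]
```

Replacing FUN = sum with FUN = mean should give the per-period means that the ddply call computes.
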
陪你到最终 2024-12-20 07:26:46

Using lubridate (with plyr loaded, as in the answer above), you can write this as a one-liner:

ddply(dat, .(location, ceiling(week(date)/3)), summarize, value = mean(value))
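
One caveat worth noting: week() returns the week of the year, so these bins are anchored to 1 January and restart every year rather than starting at the first observation. If bins counted from a fixed start date are wanted instead, a base R sketch could look like the following. The small data frame is made up for the example, and origin, bin, period_start and res are illustrative names:

```r
# Made-up example data spanning a year boundary
dat <- data.frame(
  location = rep(1:2, each = 4),
  date = rep(as.Date(c("2010-12-20", "2010-12-30", "2011-01-05", "2011-01-15")), 2),
  value = c(1, 2, 3, 4, 10, 20, 30, 40)
)

# Number of complete 21-day blocks elapsed since the first date
origin <- min(dat$date)
bin <- as.integer(dat$date - origin) %/% 21

# Label each bin with the date on which it starts
dat$period_start <- origin + 21 * bin

# Sum of value per location and 3-week period
res <- aggregate(value ~ location + period_start, data = dat, FUN = sum)
res
```

Here the first three dates fall into the bin starting on 2010-12-20 even though they straddle the new year, which is the behaviour the week-of-year approach would not give.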