Creating an array of start/end datetimes in R

Posted on 2024-12-19 23:17:32

I'm using R to do some time series analysis with zoo and chron. I have a zoo object containing a lot of data, and I need to use the window function to subset it to one day's worth of data, then the next day's worth, then the next, and so on.

I've tried to find the easiest way of creating an array that holds the date of each day in a given period, and have come up with the following:

orig = c(month=1, day=1, year=2005)
dates <- chron(1:1825, origin=orig, out.format=c(dates="d/m/y", times="h:m"))

This uses Julian day notation and covers 1825 days (365*5, so five years), starting with the first day of my date period. I then try to run a for loop over the elements of this array:

for (date in dates)
{
  s = chron(date, "00:00:00", origin=orig)
  e = chron(date, "23:59:59", origin=orig)

  aeronet_day = window(aeronet, start=s, end=e)
}

However, this gives me a warning saying that I'm using different origins for the aeronet zoo object and for the s and e variables, and it doesn't select any data.

Is there a better way to do this? Or a way to fix this? Basically, I want to run a for loop in which I can use the aeronet_day = window(aeronet, start=s, end=e) code to produce a zoo object containing the data for one day (e.g. 1st May 2005 from 00:00:00 to 23:59:59).

Comments (3)

清晰传感 2024-12-26 23:17:32

Suppose we have this data:

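# create test data: 30 observations at half-day intervals
library(zoo)
library(chron)
# the input format is given explicitly because chron's default is m/d/y
z <- zooreg(1:30, start = chron("2000-01-01", format = "y-m-d"), freq = 2)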

1) aggregate
The R aggregate function has a zoo method. The second argument is what we aggregate by. If it is a function, it is applied to the index of the zoo object. For example, here we calculate the mean for each date:

z.ag <- aggregate(z, as.Date, mean)

We can replace mean with a more complex function if we wish.
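As a rough illustration of swapping in a different summary function, the sketch below computes the per-day range instead of the mean. It is self-contained and makes its own assumptions: the test series is indexed by POSIXct in UTC rather than chron, and `to.day` and `daily.range` are hypothetical helper names, not part of zoo.

```r
library(zoo)

# Hypothetical test series (an assumption for this sketch): 30 observations
# at 12-hour spacing, indexed by POSIXct in UTC rather than chron
idx <- as.POSIXct("2000-01-01 00:00:00", tz = "UTC") + (0:29) * 12 * 3600
z <- zoo(1:30, idx)

# Map each timestamp to its calendar day (UTC)
to.day <- function(tt) as.Date(tt, tz = "UTC")

# Custom summary: spread between the largest and smallest value of the day
daily.range <- function(x) max(x) - min(x)

# aggregate applies to.day to the index and daily.range to each day's values
z.range <- aggregate(z, to.day, daily.range)
```

The result is again a zoo object, with one value per day.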

2) split
The R split function has a zoo method. If we really do want to split z by date then we can do this. Here z.split.list is a list, each of whose components contains the zoo object for one date.

z.split.list <- split(z, as.Date(time(z)))

Now either (a) sapply or (b) lapply over that list, or (c) use a for loop as follows, replacing print(zc) with whatever processing is desired. Here zc is a component of the list, i.e. the zoo object formed from the observations of one particular date:

for(zc in z.split.list) print(zc)

Note that as.Date(time(z)) is a vector with the dates corresponding to the elements of z.
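To connect this back to the window approach in the question, here is a minimal sketch showing that one component of the split list holds the same values as a one-day window. It assumes a POSIXct index in UTC instead of chron, so that start, end and the index share one time scale. The series and the chosen day are made up for illustration.

```r
library(zoo)

# Hypothetical test series: 30 observations at 12-hour spacing, UTC
idx <- as.POSIXct("2000-01-01 00:00:00", tz = "UTC") + (0:29) * 12 * 3600
z <- zoo(1:30, idx)

# One zoo object per calendar day, named by date
z.split.list <- split(z, as.Date(time(z), tz = "UTC"))

# (a) sapply over the list: number of observations on each day
n.per.day <- sapply(z.split.list, NROW)

# The same day selected with window(), using start/end of the same
# class and time zone as the index of z
s <- as.POSIXct("2000-01-02 00:00:00", tz = "UTC")
e <- as.POSIXct("2000-01-02 23:59:59", tz = "UTC")
z.day <- window(z, start = s, end = e)
```

Both z.day and z.split.list[["2000-01-02"]] contain the two observations of 2 January, so the split list can stand in for a loop over daily windows.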

EDIT:

Various minor elaborations.

清风挽心 2024-12-26 23:17:32

I'm not familiar with zoo, but I usually just convert the date to a numeric, then make the sequence, and then convert back again. For example:

> as.Date(Sys.Date():(Sys.Date()+365), origin='1970-01-01')
  [1] "2011-12-06" "2011-12-07" "2011-12-08" "2011-12-09" "2011-12-10" "2011-12-11" "2011-12-12" "2011-12-13"
  [9] "2011-12-14" "2011-12-15" "2011-12-16" "2011-12-17" "2011-12-18" "2011-12-19" "2011-12-20" "2011-12-21"
 [17] "2011-12-22" "2011-12-23" "2011-12-24" "2011-12-25" "2011-12-26" "2011-12-27" "2011-12-28" "2011-12-29"
 [25] "2011-12-30" "2011-12-31" "2012-01-01" "2012-01-02" "2012-01-03" "2012-01-04" "2012-01-05" "2012-01-06"
 [33] "2012-01-07" "2012-01-08" "2012-01-09" "2012-01-10" "2012-01-11" "2012-01-12" "2012-01-13" "2012-01-14"
 [41] "2012-01-15" "2012-01-16" "2012-01-17" "2012-01-18" "2012-01-19" "2012-01-20" "2012-01-21" "2012-01-22"
...
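A possible alternative that skips the numeric round trip is base R's seq method for Date objects. The sketch below builds the 1825 days from the question and, as an assumption about what the start/end pairs should look like, derives midnight and 23:59:59 timestamps in UTC for each day. The variable names are illustrative only.

```r
# One Date per day, starting at the first day of the period in the question
first.day <- as.Date("2005-01-01")
days <- seq(first.day, by = "day", length.out = 1825)

# Start and end of each day as POSIXct in UTC (the time zone is an assumption)
starts <- as.POSIXct(paste(days, "00:00:00"), tz = "UTC")
ends <- starts + 86399  # 23:59:59 on the same day
```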
呆萌少年 2024-12-26 23:17:32

If you want to do something on a per date basis, then what you have is fine.

Some sample aeronet data.

library(chron)

orig <- c(month = 1, day = 1, year = 2005)  # same origin as in the question
last_date <- 1825
n <- 10000
aeronet <- data.frame(
  some.value = seq_len(n), 
  date = as.chron(
    runif(n, 0, last_date), 
    origin = orig,
    out.format = c(dates = "d/m/y", times = "h:m")
  )
)

Now you can split the data by date using split, or apply a function to each date with tapply or ddply from plyr (or use aggregate or whatever).

with(aeronet, split(some.value, date))
with(aeronet, tapply(some.value, date, sum))

library(plyr)
ddply(aeronet, .(date), summarise, sum(some.value))
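One caveat: the sample date column above carries a time-of-day fraction, so grouping on it directly may give one group per timestamp rather than one per day. The sketch below is a base R variant that drops the time of day first. It assumes POSIXct timestamps in UTC instead of chron; the `datetime` and `day` column names are hypothetical.

```r
set.seed(1)  # reproducible hypothetical data
n <- 1000
aeronet <- data.frame(
  some.value = seq_len(n),
  # random timestamps within 1825 days of 2005-01-01, in UTC
  datetime = as.POSIXct("2005-01-01", tz = "UTC") + runif(n, 0, 1825 * 86400)
)

# Drop the time of day so that each group is one calendar day
aeronet$day <- as.Date(aeronet$datetime, tz = "UTC")

# Values per day as a list, and their per-day sums
daily.list <- with(aeronet, split(some.value, day))
daily.sum <- with(aeronet, tapply(some.value, day, sum))
```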