Find the most recent observation earlier than some timestamp using XTS
I have an xts object that looks like this:
> q.xts
val
2011-08-31 09:30:00.002357 -1.0135222
2011-08-31 09:30:00.003443 -0.2182679
2011-08-31 09:30:00.005075 -0.5317191
2011-08-31 09:30:00.009515 -1.0639535
2011-08-31 09:30:00.011569 -1.2470759
2011-08-31 09:30:00.012144 0.7678103
2011-08-31 09:30:00.023813 -0.6303432
2011-08-31 09:30:00.024107 -0.5105943
I calculate a fixed offset from timestamps in another data frame, r. The number of rows in r is significantly smaller than the number of rows in q.xts.
> r
time predict.time
1 2011-08-31 09:30:00.003443 2011-08-31 09:30:00.002443
2 2011-08-31 09:30:00.009515 2011-08-31 09:30:00.008515
3 2011-08-31 09:30:00.024107 2011-08-31 09:30:00.023108
The time column corresponds to an observation from q.xts, while the predict.time column is 1 millisecond earlier than time (less any precision round-offs).
What I would like to do is, for each value of predict.time, find the last observation in q.xts whose timestamp is equal to or earlier than that value. For the three observations in r above I would expect the following matches:
time predict.time (time from q.xts)
1 2011-08-31 09:30:00.003443 2011-08-31 09:30:00.002443 --> 09:30:00.002357
2 2011-08-31 09:30:00.009515 2011-08-31 09:30:00.008515 --> 09:30:00.005075
3 2011-08-31 09:30:00.024107 2011-08-31 09:30:00.023108 --> 09:30:00.012144
I had approached this by looping over each row in r and performing an xts subset. So, for row 1 of r I would do:
> last(index(q.xts[paste('/', r[1,]$predict.time, sep='')]))
[1] "2011-08-31 09:30:00.002357 CDT"
QUESTION: Doing this with a loop seems clunky and awkward. Is there a better way? I would like to end up with another column in r that provides the exact time or row number for the corresponding value in q.xts.
NOTE: Use this to build the data I've used for this example:
library(xts)  # provides xts(), used below
q <- read.csv(tc <- textConnection("
2011-08-31 09:30:00.002358, -1.01352216
2011-08-31 09:30:00.003443, -0.21826793
2011-08-31 09:30:00.005076, -0.53171913
2011-08-31 09:30:00.009515, -1.06395353
2011-08-31 09:30:00.011570, -1.24707591
2011-08-31 09:30:00.012144, 0.76781028
2011-08-31 09:30:00.023814, -0.63034317
2011-08-31 09:30:00.024108, -0.51059425"),
header=FALSE); close(tc)
colnames(q) <- c('datetime', 'val')
q.xts <- xts(q[-1], as.POSIXct(q$datetime))
r <- read.csv(tc <- textConnection("
2011-08-31 09:30:00.003443
2011-08-31 09:30:00.009515
2011-08-31 09:30:00.024108"),
header=FALSE); close(tc)
colnames(r) <- c('time')
r$time <- as.POSIXct(strptime(r$time, '%Y-%m-%d %H:%M:%OS'))
r$predict.time <- r$time - 0.001
Comments (2)
There may be a better way to do this, but this is the best I can come up with at the moment.
I'd use findInterval. Your times are POSIXct, so this should be fairly robust.
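The code that accompanied this answer is not preserved here, so the following is only a minimal sketch of how a findInterval-based lookup might look, not the answerer's original code. It uses base R only: q.times is a stand-in for index(q.xts), and the new column names q.row and q.time are hypothetical. The time zone is fixed to UTC purely to make the sketch reproducible.

```r
# Assumption: q.times stands in for index(q.xts); it must be sorted ascending.
q.times <- as.POSIXct(c("2011-08-31 09:30:00.002358",
                        "2011-08-31 09:30:00.003443",
                        "2011-08-31 09:30:00.005076",
                        "2011-08-31 09:30:00.009515",
                        "2011-08-31 09:30:00.011570",
                        "2011-08-31 09:30:00.012144",
                        "2011-08-31 09:30:00.023814",
                        "2011-08-31 09:30:00.024108"), tz = "UTC")

r <- data.frame(time = as.POSIXct(c("2011-08-31 09:30:00.003443",
                                    "2011-08-31 09:30:00.009515",
                                    "2011-08-31 09:30:00.024108"), tz = "UTC"))
r$predict.time <- r$time - 0.001  # 1 millisecond earlier

# For each predict.time, findInterval() returns how many elements of q.times
# are <= that value, which is the row of the last observation at or before it.
idx <- findInterval(as.numeric(r$predict.time), as.numeric(q.times))
idx[idx == 0] <- NA  # predict.time precedes every observation: no match

r$q.row  <- idx           # hypothetical column: row number in q.xts
r$q.time <- q.times[idx]  # hypothetical column: matched timestamp
```

For this data the lookup should select rows 1, 3 and 6, which are the matches the question expects. With the real object, index(q.xts) would take the place of q.times, and the single vectorized call replaces the per-row loop.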