Selecting values from a vector using dates as the index

Posted on 2024-08-30 08:31:25

Suppose I have a named vector, bar:

bar=c()
bar["1997-10-14"]=1
bar["2001-10-14"]=2
bar["2007-10-14"]=1

How can I select from bar all values for which the index is within a specific date range? So, if I look for all values between "1995-01-01" and "2000-06-01", I should get 1. And similarly for the period between "2001-09-01" and "2007-11-04", I should get 2 and 1.

Comments (3)

时光是把杀猪刀 2024-09-06 08:31:25

This problem has been solved for good with the xts package which extends functionality from the zoo package.

R> library(xts)
Loading required package: zoo
R> bar <- xts(1:3, order.by=as.Date("2001-01-01")+365*0:2)
R> bar
           [,1]
2001-01-01    1
2002-01-01    2
2003-01-01    3
R> bar["2002::"]        ## open range with a start year
           [,1]
2002-01-01    2
2003-01-01    3
R> bar["::2002"]        ## or end year
           [,1]
2001-01-01    1
2002-01-01    2
R> bar["2002-01-01"]    ## or hits a particular date
           [,1]
2002-01-01    2
R> 

There is a lot more to xts than this, but the basic point is: do not operate on strings masquerading as dates.

Use a Date type, or preferably even an extension package built to efficiently index on millions of dates.
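Applied to the data from the question, the same idea might look like the sketch below. It assumes the xts package is installed and uses the `from/to` range string form, which xts accepts alongside the `::` separator shown above; range endpoints are inclusive.

```r
library(xts)

# The question's values, indexed by real Date objects instead of strings
bar <- xts(c(1, 2, 1),
           order.by = as.Date(c("1997-10-14", "2001-10-14", "2007-10-14")))

# Closed ranges written as "from/to"
first_range  <- bar["1995-01-01/2000-06-01"]   # one row: 1997-10-14
second_range <- bar["2001-09-01/2007-11-04"]   # two rows: 2001-10-14, 2007-10-14

first_range
second_range
```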

单身狗的梦 2024-09-06 08:31:25

You need to convert your dates from characters into a Date type with as.Date() (or a POSIX type if you have more information like the time of day). Then you can make comparisons with standard relational operators such as <= and >=.

You should consider using a timeseries package such as zoo for this.

Edit:

Just to respond to your comment, here's an example of using dates with your existing vector:

> as.Date(names(bar)) < as.Date("2001-10-14")
[1]  TRUE FALSE FALSE
> bar[as.Date(names(bar)) < as.Date("2001-10-14")]
1997-10-14 
         1
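The one-sided comparison above extends naturally to a two-sided range. The following is a minimal base R sketch; the helper name `select_between` is hypothetical (it is not part of the original answer), and treating both bounds as inclusive is an assumption.

```r
# Named vector from the question, built in one call
bar <- c("1997-10-14" = 1, "2001-10-14" = 2, "2007-10-14" = 1)

# Hypothetical helper: keep the elements whose name, parsed as a Date,
# falls within [from, to] (both bounds inclusive)
select_between <- function(x, from, to) {
  d <- as.Date(names(x))
  x[d >= as.Date(from) & d <= as.Date(to)]
}

select_between(bar, "1995-01-01", "2000-06-01")   # the 1997-10-14 element
select_between(bar, "2001-09-01", "2007-11-04")   # the 2001 and 2007 elements
```

Swap `>=` and `<=` for `>` and `<` if the endpoints should be excluded.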

Although you really should just use a time series package. Here's how you could do this with zoo (or xts, timeSeries, fts, etc.):

library(zoo)
ts <- zoo(c(1, 2, 1), as.Date(c("1997-10-14", "2001-10-14", "2007-10-14")))
ts[index(ts) < as.Date("2001-10-14"),]

Since the index is now a Date type, you can make as many comparisons as you want. Read the zoo vignette for more information.

旧时浪漫 2024-09-06 08:31:25

Using the fact that zero-padded YYYY-MM-DD date strings sort lexically in chronological order:

bar[names(bar) > "1995-01-01" & names(bar) < "2000-06-01"]
# 1997-10-14 
#          1 

bar[names(bar) > "2001-09-01" & names(bar) < "2007-11-04"]
# 2001-10-14 2007-10-14 
#          2          1 

The result is a named vector (like your original bar, which is a named vector, not a list).
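The string comparison relies on every date being zero-padded and written year first. A small sketch of where it breaks, using a made-up unpadded date string for illustration:

```r
# Zero-padded ISO 8601 strings compare correctly as text
padded_ok <- "1997-10-14" < "2001-10-14"          # TRUE

# Without zero padding, text comparison disagrees with the calendar:
# September 2001 precedes October 2001, yet "9" sorts after "1"
unpadded_text <- "2001-9-01" < "2001-10-14"       # FALSE

# Parsing to Date first gives the chronological answer
unpadded_date <- as.Date("2001-9-01") < as.Date("2001-10-14")   # TRUE
```

If the names cannot be guaranteed to follow the padded format, converting with as.Date() before comparing is the safer route.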

As Dirk states in his answer, it's better to use Date, for efficiency reasons. Without external packages, you could rearrange your data and create two vectors (or a two-column data.frame), one for dates and one for values:

bar_dates <- as.Date(c("1997-10-14", "2001-10-14", "2007-10-14"))
bar_values <- c(1,2,1)

then use simple indexing:

bar_values[bar_dates > as.Date("1995-01-01") & bar_dates < as.Date("2000-06-01")]
# [1] 1

bar_values[bar_dates > as.Date("2001-09-01") & bar_dates < as.Date("2007-11-04")]
# [1] 2 1
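
The two-column data.frame variant mentioned above could be sketched as follows. The names `bar_df`, `date`, and `value` are illustrative choices, not taken from the original answer.

```r
# Two-column data frame: one Date column, one value column
bar_df <- data.frame(
  date  = as.Date(c("1997-10-14", "2001-10-14", "2007-10-14")),
  value = c(1, 2, 1)
)

# Logical row index with strict bounds, as in the vector examples
in_range <- bar_df$date > as.Date("2001-09-01") &
            bar_df$date < as.Date("2007-11-04")
bar_df[in_range, ]

# subset() expresses the same kind of filter with column names in scope
early <- subset(bar_df, date > as.Date("1995-01-01") & date < as.Date("2000-06-01"))
early
```

Keeping dates and values in one data frame means a single row index selects both, so the two cannot drift out of alignment.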