Can't format months with as.Date
I'm missing something obvious with the "format" section of as.Date. Consider this example:
d1 <- data.frame(d  = c("1/Jan/1947",
                        "1/Feb/1947",
                        "1/Mar/1947"),
                 d2 = c("Jan/1947",
                        "Feb/1947",
                        "Mar/1947"))
d1$date1 <- as.Date(x=d1$d, format="%d/%b/%Y")
d1$date2 <- as.Date(x=d1$d2, format="%b/%Y")
           d       d2      date1 date2
1 1/Jan/1947 Jan/1947 1947-01-01  <NA>
2 1/Feb/1947 Feb/1947 1947-02-01  <NA>
3 1/Mar/1947 Mar/1947 1947-03-01  <NA>
So my question is really simple: I don't understand why date1 works but date2 doesn't.
The simplest answer is that a date is something which includes a day and if one is not specified, as.Date() gets confused. From the ?as.Date documentation:
When you think about it, a term such as "Mar/1947" is not, strictly speaking, a date - it's just a combination of month and year. A date is a specific day in March 1947 (or any other month + year) - since you don't specify one, you don't have a date.
It is because d2 in your data.frame is a malformed date. It doesn't contain a day. To get round this, consider using the following:
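The snippet that accompanied this answer was not preserved. A minimal sketch of one possible workaround, offered as an assumption rather than the answerer's original code, is to paste a day of the month onto each string before parsing. Using the first of the month is an arbitrary choice, and the `Sys.setlocale()` call is there only so that `%b` matches English month abbreviations regardless of the machine's locale.

```r
# Assumption: English month abbreviations; force the C locale so %b matches them
invisible(Sys.setlocale("LC_TIME", "C"))

d2 <- c("Jan/1947", "Feb/1947", "Mar/1947")

# Prepend an (arbitrary) day of month so the string describes a complete date
date2 <- as.Date(paste0("1/", d2), format = "%d/%b/%Y")

format(date2)
```

With the day supplied, the format string matches the one that already works for the `d` column in the question.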
I don't know, but %b doesn't seem to work when it's the leading field.
The following all fail (give NA):
whereas when you precede %b with %d, it works:
Seems like neilfws has the answer about incompleteness.
This would also explain why giving only the year gives:
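The example calls from this answer were not preserved. The sketch below shows calls in the same spirit; the specific inputs are assumptions, not the answerer's originals. As far as I know, R fills in an unspecified month and day from the current date, but refuses a month that comes without a day, which would be consistent with the behaviour described here.

```r
# Assumption: English month abbreviations; force the C locale so %b matches them
invisible(Sys.setlocale("LC_TIME", "C"))

# Month and year without a day: parsing yields NA
no_day <- as.Date("Jan/1947", format = "%b/%Y")
is.na(no_day)

# The same month and year preceded by a day: parsing succeeds
with_day <- as.Date("1/Jan/1947", format = "%d/%b/%Y")
format(with_day)

# Year only: the missing month and day are taken from today's date,
# so the result depends on when this is run
year_only <- as.Date("1947", format = "%Y")
format(year_only)
```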
According to the document "Handling date-times in R" by Cole Beck, a date is stored internally as a single numeric value: the number of days elapsed since the reference date, 1970-01-01.
Example: 1970-01-31 is stored internally as 30.
So, coming back to the problem: when the input does not include a day (%d), the date is incomplete, so there is no day count to store internally and as.Date() returns NA.
Source: http://biostat.mc.vanderbilt.edu/wiki/pub/Main/ColeBeck/datestimes.pdf
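The internal representation described in this answer can be checked directly. This is a small sketch using only base R; the variable names are illustrative.

```r
# A Date is a number of days since 1970-01-01 with a class attribute on top
d <- as.Date("1970-01-31")
days_since_epoch <- as.numeric(d)
days_since_epoch

# The reference date itself is stored as 0
epoch_value <- as.numeric(as.Date("1970-01-01"))
epoch_value

# Going the other way: turn a day count back into a Date
round_trip <- as.Date(30, origin = "1970-01-01")
format(round_trip)
```

A string with only a month and a year does not identify a single day, so there is no one day count it could map to.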