How to prevent ifelse() from turning Date objects into numeric objects

Posted on 2024-11-19 20:04:46

I am using the function ifelse() to manipulate a date vector. I expected the result to be of class Date, and was surprised to get a numeric vector instead. Here is an example:

dates <- as.Date(c('2011-01-01', '2011-01-02', '2011-01-03', '2011-01-04', '2011-01-05'))
dates <- ifelse(dates == '2011-01-01', dates - 1, dates)
str(dates)

This is especially surprising because performing the operation across the entire vector returns a Date object.

dates <- as.Date(c('2011-01-01', '2011-01-02', '2011-01-03', '2011-01-04','2011-01-05'))
dates <- dates - 1
str(dates)

Should I be using some other function to operate on Date vectors? If so, what function? If not, how do I force ifelse to return a vector of the same type as the input?

The help page for ifelse indicates that this is a feature, not a bug, but I'm still struggling to find an explanation for what I found to be surprising behavior.

Comments (7)

幸福%小乖 2024-11-26 20:04:46

You may use data.table::fifelse (data.table >= 1.12.3) or dplyr::if_else.


data.table::fifelse

Unlike ifelse, fifelse preserves the type and class of the inputs.

library(data.table)
dates <- fifelse(dates == '2011-01-01', dates - 1, dates)
str(dates)
# Date[1:5], format: "2010-12-31" "2011-01-02" "2011-01-03" "2011-01-04" "2011-01-05"

dplyr::if_else

From dplyr 0.5.0 release notes:

"[if_else] have stricter semantics than ifelse(): the true and false arguments must be the same type. This gives a less surprising return type, and preserves S3 vectors like dates."

library(dplyr)
dates <- if_else(dates == '2011-01-01', dates - 1, dates)
str(dates)
# Date[1:5], format: "2010-12-31" "2011-01-02" "2011-01-03" "2011-01-04" "2011-01-05"

给妤﹃绝世温柔 2024-11-26 20:04:46

It relates to the documented Value of ifelse:

A vector of the same length and attributes (including dimensions and "class") as test and data values from the values of yes or no. The mode of the answer will be coerced from logical to accommodate first any values taken from yes and then any values taken from no.

Boiled down to its implications, ifelse makes factors lose their levels and Dates lose their class and only their mode ("numeric") is restored. Try this instead:

dates[dates == '2011-01-01'] <- dates[dates == '2011-01-01'] - 1
str(dates)
# Date[1:5], format: "2010-12-31" "2011-01-02" "2011-01-03" "2011-01-04" "2011-01-05"
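
The behaviour described in the Value section can be checked directly. The following is a minimal base R sketch (the names `test` and `out` are illustrative, not part of the original answer): a Date is stored as a count of days since 1970-01-01 plus a class attribute, and ifelse() builds its result from the test, which here is a plain logical vector with no class to pass on.

```r
dates <- as.Date(c("2011-01-01", "2011-01-02", "2011-01-03"))

# A Date is a double (days since 1970-01-01) with a "class" attribute.
unclass(dates)

test <- dates == as.Date("2011-01-01")
attributes(test)   # the logical test carries no class attribute

# ifelse() starts from `test` and fills in values from yes/no,
# so only the underlying numbers survive.
out <- ifelse(test, dates - 1, dates)
class(out)

# Subset assignment dispatches to the Date method and keeps the class.
dates[test] <- dates[test] - 1
class(dates)
```

Because the subset assignment goes through the Date method for `[<-`, it never passes through the attribute-dropping step that ifelse() performs.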

You could create a safe.ifelse:

safe.ifelse <- function(cond, yes, no) {
  class.y <- class(yes)
  X <- ifelse(cond, yes, no)
  class(X) <- class.y
  return(X)
}

safe.ifelse(dates == '2011-01-01', dates - 1, dates)
# [1] "2010-12-31" "2011-01-02" "2011-01-03" "2011-01-04" "2011-01-05"

A later note: I see that Hadley has built an if_else into the magrittr/dplyr/tidyr complex of data-shaping packages that does preserve the class of the consequent.

水中月 2024-11-26 20:04:46

DWin's explanation is spot on. I fiddled and fought with this for a while before I realized I could simply force the class after the ifelse statement:

dates <- as.Date(c('2011-01-01','2011-01-02','2011-01-03','2011-01-04','2011-01-05'))
dates <- ifelse(dates=='2011-01-01',dates-1,dates)
str(dates)
class(dates)<- "Date"
str(dates)

At first this felt a little "hackish" to me. But now I just think of it as a small price to pay for the performance returns that I get from ifelse(). Plus it's still a lot more concise than a loop.
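
An equivalent way to restore the class is to pass the numeric result back through as.Date(). This is a minimal sketch, assuming the numbers are day counts since 1970-01-01 (which is how base R stores Dates); the names `num` and `fixed` are illustrative. The explicit `origin` argument is needed in older R versions and is harmless in newer ones.

```r
dates <- as.Date(c("2011-01-01", "2011-01-02", "2011-01-03"))

# ifelse() returns the day counts as plain numbers ...
num <- ifelse(dates == as.Date("2011-01-01"), dates - 1, dates)

# ... and as.Date() turns them back into Dates.
fixed <- as.Date(num, origin = "1970-01-01")
str(fixed)
```

Compared with assigning the class directly, this form can be written as a single nested call around ifelse().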

所有深爱都是秘密 2024-11-26 20:04:46

The reason this doesn't work is that ifelse() drops the Date class and returns only the underlying numeric values. A workaround is to convert both branches to character before evaluating them, then convert the result back with as.Date().

dates <- as.Date(c('2011-01-01',
                   '2011-01-02',
                   '2011-01-03',
                   '2011-01-04',
                   '2011-01-05'))
dates_new <- dates - 1
dates <- as.Date(ifelse(dates == '2011-01-01',
                        as.character(dates_new),
                        as.character(dates)))

This wouldn't require any library apart from base R.

半衬遮猫 2024-11-26 20:04:46

The suggested method does not work with factor columns. I'd like to suggest this improvement:

safe.ifelse <- function(cond, yes, no) {
  class.y <- class(yes)
  if (class.y == "factor") {
    levels.y = levels(yes)
  }
  X <- ifelse(cond,yes,no)
  if (class.y == "factor") {
    X = as.factor(X)
    levels(X) = levels.y
  } else {
    class(X) <- class.y
  }
  return(X)
}

By the way: ifelse sucks... with great power comes great responsibility, i.e. type conversions of 1x1 matrices and/or numerics [when they should be added for example] is ok to me but this type conversion in ifelse is clearly unwanted. I bumped into the very same 'bug' of ifelse multiple times now and it just keeps on stealing my time :-(

FW

已下线请稍等 2024-11-26 20:04:46

The answer provided by @fabian-werner is great, but objects can have multiple classes, and "factor" may not necessarily be the first one returned by class(yes), so I suggest this small modification to check all class attributes:

safe.ifelse <- function(cond, yes, no) {
  class.y <- class(yes)
  if ("factor" %in% class.y) {  # Note the small condition change here
    levels.y = levels(yes)
  }
  X <- ifelse(cond,yes,no)
  if ("factor" %in% class.y) {  # Note the small condition change here
    X = as.factor(X)
    levels(X) = levels.y
  } else {
    class(X) <- class.y
  }
  return(X)
}

I have also submitted a request with the R Development team to add a documented option to have base::ifelse() preserve attributes based on user selection of which attributes to preserve. The request is here: https://bugs.r-project.org/bugzilla/show_bug.cgi?id=16609 - It has already been flagged as "WONTFIX" on the grounds that it has always been the way it is now, but I have provided a follow-up argument on why a simple addition might save a lot of R users headaches. Perhaps your "+1" in that bug thread will encourage the R Core team to take a second look.

EDIT: Here's a better version that lets the user specify which attributes to preserve: "cond" (the default ifelse() behaviour), "yes" (the behaviour of the code above), or "no" (for cases where the attributes of the "no" value are the better choice):

safe_ifelse <- function(cond, yes, no, preserved_attributes = "yes") {
    # Capture the user's choice for which attributes to preserve in return value
    preserved           <- switch(EXPR = preserved_attributes, "cond" = cond,
                                                               "yes"  = yes,
                                                               "no"   = no);
    # Preserve the desired values and check if object is a factor
    preserved_class     <- class(preserved);
    preserved_levels    <- levels(preserved);
    preserved_is_factor <- "factor" %in% preserved_class;

    # We have to use base::ifelse() for its vectorized properties
    # If we do our own if() {} else {}, then it will only work on first variable in a list
    return_obj <- ifelse(cond, yes, no);

    # If the object whose attributes we want to retain is a factor
    # Typecast the return object as.factor()
    # Set its levels()
    # Then check to see if it's also one or more classes in addition to "factor"
    # If so, set the classes, which will preserve "factor" too
    if (preserved_is_factor) {
        return_obj          <- as.factor(return_obj);
        levels(return_obj)  <- preserved_levels;
        if (length(preserved_class) > 1) {
          class(return_obj) <- preserved_class;
        }
    }
    # In all cases we want to preserve the class of the chosen object, so set it here
    else {
        class(return_obj)   <- preserved_class;
    }
    return(return_obj);

} # End safe_ifelse function

天赋异禀 2024-11-26 20:04:46

Why not use indexing here?

> dates <- as.Date(c('2011-01-01', '2011-01-02', '2011-01-03', '2011-01-04', '2011-01-05'))
> dates[dates == '2011-01-01'] <- NA
> str(dates)
 Date[1:5], format: NA "2011-01-02" "2011-01-03" "2011-01-04" "2011-01-05"
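
The same indexing idea also covers the original question, where the matching element is shifted by one day rather than set to NA. This is a minimal sketch using base R's replace(), which performs the subset assignment on a copy and so keeps the Date class; the names `hit` and `shifted` are illustrative.

```r
dates <- as.Date(c("2011-01-01", "2011-01-02", "2011-01-03"))
hit <- dates == as.Date("2011-01-01")

# replace() is essentially `x[list] <- values; x`, so the Date method
# for `[<-` is used and the class is preserved.
shifted <- replace(dates, hit, dates[hit] - 1)
str(shifted)
```

Unlike the in-place assignment, replace() leaves `dates` itself unchanged, which may be convenient inside a longer expression.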