How to change values in one column based on a date/time range in a different column in R

Posted 2025-01-16 23:56:47

I have a data frame with a date/time column and a column of numeric values. I'd like to change some of the numeric values to NA based on the date/time range in which they were recorded.

This is what my data frame looks like:

df = structure(list(
  Date_Time_GMT_3 = structure(
    c(1592226000, 1592226900, 1592227800, 1592228700, 1592229600, 1592230500),
    class = c("POSIXct", "POSIXt"), tzone = "EST"),
  diff_20676892_AIR_X3lh = c(NA, 0.385999999999999, 0.193, 0.290000000000001, 0.385, 0.576000000000001),
  diff_20819828_B1LH_DOUBLE_CHECK = c(NA, 0, 0, 0, 0.0949999999999989, 0)),
  row.names = c(NA, 6L), class = "data.frame")

I want to change all values of diff_20819828_B1LH_DOUBLE_CHECK to NA if they were recorded between 2020-06-15 08:30:00 and 2020-06-15 09:00:00.

I tried this code:

df[df$Date_Time_GMT_3 > "2020-06-15 08:30:00"| < "2020-06-15 09:00:00"] = "NA"

but to no surprise this doesn't work. How can I fix this?

Comments (2)

追星践月 2025-01-23 23:56:47

Your date column is in "EST", so you can do this:

df[df$Date_Time_GMT_3 > as.POSIXct("2020-06-15 08:30:00", tz="EST") &
     df$Date_Time_GMT_3 < as.POSIXct("2020-06-15 09:00:00", tz="EST"),3] <- NA

      Date_Time_GMT_3 diff_20676892_AIR_X3lh diff_20819828_B1LH_DOUBLE_CHECK
1 2020-06-15 08:00:00                     NA                              NA
2 2020-06-15 08:15:00                  0.386                           0.000
3 2020-06-15 08:30:00                  0.193                           0.000
4 2020-06-15 08:45:00                  0.290                              NA
5 2020-06-15 09:00:00                  0.385                           0.095
6 2020-06-15 09:15:00                  0.576                           0.000

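Note that only one row, row 4 (08:45:00), falls strictly between those times; rows 3 and 5 sit exactly on the boundaries, so the > and < comparisons exclude them. The code above sets the value in the 3rd column to NA for every row inside the range.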
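To see why the tz argument matters here, the following minimal sketch parses the same wall-clock string in two time zones and compares the resulting instants. It uses only base R; the variable names (est, utc, gap_hours, first_stamp) are made up for this illustration, and the first timestamp is rebuilt the same way the question's dput output builds it.

```r
# The same wall-clock string parsed in two time zones gives two different instants.
est <- as.POSIXct("2020-06-15 08:30:00", tz = "EST")
utc <- as.POSIXct("2020-06-15 08:30:00", tz = "UTC")

# "EST" is a fixed UTC-5 zone, so the EST instant falls five hours after the UTC one.
gap_hours <- as.numeric(difftime(est, utc, units = "hours"))
print(gap_hours)

# The first timestamp from the question's data, carrying the same "EST" tzone attribute.
first_stamp <- structure(1592226000, class = c("POSIXct", "POSIXt"), tzone = "EST")
print(format(first_stamp, "%Y-%m-%d %H:%M:%S"))
```

If the bounds were parsed in a different zone than the column, the comparison would still run without error but could select different rows, which is why both bounds above are given tz="EST".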

雪落纷纷 2025-01-23 23:56:47

Your base R code isn't working because:

  1. You didn't specify which column's values should be changed.
  2. You're using | (or) instead of & (and).
  3. After a logical operator you need to repeat the vector being compared.
  4. You're not telling R that those strings are date-times.
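As a rough sketch of how the four points map onto the original attempt, here is one way to write it in base R with each fix marked in a comment. The data frame is rebuilt with rounded values so the block runs on its own, and range_start, range_end, and in_range are names chosen only for this illustration.

```r
# Rebuild the question's data frame (numeric values rounded for readability).
df <- data.frame(
  Date_Time_GMT_3 = structure(
    c(1592226000, 1592226900, 1592227800, 1592228700, 1592229600, 1592230500),
    class = c("POSIXct", "POSIXt"), tzone = "EST"
  ),
  diff_20676892_AIR_X3lh = c(NA, 0.386, 0.193, 0.290, 0.385, 0.576),
  diff_20819828_B1LH_DOUBLE_CHECK = c(NA, 0, 0, 0, 0.095, 0)
)

# Point 4: convert the strings to date-times in the column's time zone.
range_start <- as.POSIXct("2020-06-15 08:30:00", tz = "EST")
range_end   <- as.POSIXct("2020-06-15 09:00:00", tz = "EST")

# Points 2 and 3: combine with & and name the vector on both sides of it.
in_range <- df$Date_Time_GMT_3 > range_start & df$Date_Time_GMT_3 < range_end

# Point 1: assign NA only within the column that should change.
df$diff_20819828_B1LH_DOUBLE_CHECK[in_range] <- NA

print(df)
```

Selecting the column by name rather than by position keeps the assignment correct if columns are later reordered. Switching to >= and <= would also include the rows stamped exactly 08:30:00 and 09:00:00.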

Langtang's solution is very neat. Another option using dplyr and lubridate is:

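library(dplyr)
library(lubridate)
# Build the interval in the data's time zone ("EST"); %within% includes both endpoints
rng <- interval(ymd_hms("2020-06-15 08:30:00", tz = "EST"),
                ymd_hms("2020-06-15 09:00:00", tz = "EST"))
# na_if() compares values rather than testing a condition, so if_else() is used here
df %>% mutate(diff_20819828_B1LH_DOUBLE_CHECK = if_else(
  Date_Time_GMT_3 %within% rng,
  NA_real_,
  diff_20819828_B1LH_DOUBLE_CHECK
))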