How to change values in one column based on a date/time range in a different column in R

Posted 2025-01-16 23:56:47


I have a data frame with a date/time column and a column with some numeric values. I'd like to change some of the numeric values to NA based on the date/time range in which they were recorded.

This is what my dataframe looks like

df = structure(list(
  Date_Time_GMT_3 = structure(
    c(1592226000, 1592226900, 1592227800, 1592228700, 1592229600, 1592230500),
    class = c("POSIXct", "POSIXt"), tzone = "EST"),
  diff_20676892_AIR_X3lh = c(NA, 0.385999999999999, 0.193, 0.290000000000001, 0.385, 0.576000000000001),
  diff_20819828_B1LH_DOUBLE_CHECK = c(NA, 0, 0, 0, 0.0949999999999989, 0)),
  row.names = c(NA, 6L), class = "data.frame")

I want to change all values of diff_20819828_B1LH_DOUBLE_CHECK to NA if they were recorded between 2020-06-15 08:30:00 and 2020-06-15 09:00:00.

I tried this code

df[df$Date_Time_GMT_3 > "2020-06-15 08:30:00"| < "2020-06-15 09:00:00"] = "NA"

but to no surprise this doesn't work. How can I fix this?

追星践月 2025-01-23 23:56:47

Your date column is in "EST", so you can do this:

df[df$Date_Time_GMT_3 > as.POSIXct("2020-06-15 08:30:00", tz="EST") &
     df$Date_Time_GMT_3 < as.POSIXct("2020-06-15 09:00:00", tz="EST"),3] <- NA

      Date_Time_GMT_3 diff_20676892_AIR_X3lh diff_20819828_B1LH_DOUBLE_CHECK
1 2020-06-15 08:00:00                     NA                              NA
2 2020-06-15 08:15:00                  0.386                           0.000
3 2020-06-15 08:30:00                  0.193                           0.000
4 2020-06-15 08:45:00                  0.290                              NA
5 2020-06-15 09:00:00                  0.385                           0.095
6 2020-06-15 09:15:00                  0.576                           0.000

Note that only one row, row 4 (08:45), falls strictly between those times. The code above sets the third column to NA for every row in that range.
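A minimal, self-contained sketch of the same idea, for anyone who wants to run it from scratch. It rebuilds the data frame from the question (with the values rounded as in the printed output) and addresses the target column by name rather than by position 3. The names lo, hi, strict and inclusive are introduced here for illustration only, and the inclusive variant is an assumption about what "between" might be meant to cover, not part of the answer.

```r
# Rebuild the question's data frame; times are POSIXct displayed in "EST".
df <- data.frame(
  Date_Time_GMT_3 = as.POSIXct(
    c(1592226000, 1592226900, 1592227800, 1592228700, 1592229600, 1592230500),
    origin = "1970-01-01", tz = "EST"
  ),
  diff_20676892_AIR_X3lh = c(NA, 0.386, 0.193, 0.290, 0.385, 0.576),
  diff_20819828_B1LH_DOUBLE_CHECK = c(NA, 0, 0, 0, 0.095, 0)
)

# Range bounds, parsed in the same time zone as the column.
lo <- as.POSIXct("2020-06-15 08:30:00", tz = "EST")
hi <- as.POSIXct("2020-06-15 09:00:00", tz = "EST")

# Strict bounds, as in the answer: only the 08:45 row qualifies.
strict <- df$Date_Time_GMT_3 > lo & df$Date_Time_GMT_3 < hi

# Inclusive bounds (hypothetical alternative): 08:30, 08:45 and 09:00 qualify.
inclusive <- df$Date_Time_GMT_3 >= lo & df$Date_Time_GMT_3 <= hi

# Select the column by name so the code survives column reordering.
df$diff_20819828_B1LH_DOUBLE_CHECK[strict] <- NA
```

Swapping strict for inclusive in the last line would also blank the rows at exactly 08:30 and 09:00, so it is worth deciding which behaviour is wanted before running it on real data.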


雪落纷纷 2025-01-23 23:56:47

Your base R code isn't working because:

You didn't specify which column's values should be changed.
You're using | (or) instead of & (and).
After a logical operator you need to repeat the vector being compared; "| < ..." has no left-hand side.
You're not telling R that those strings are date-times in the column's time zone ("EST").
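The difference between | and & in the list above can be checked on a small vector. This is only an illustrative sketch: the three sample times and the names times, lo, hi, either and both are made up for the example and are not taken from the question's data.

```r
# Three sample times: one before, one inside, and one after the range.
times <- as.POSIXct(
  c("2020-06-15 08:15:00", "2020-06-15 08:45:00", "2020-06-15 09:15:00"),
  tz = "EST"
)
lo <- as.POSIXct("2020-06-15 08:30:00", tz = "EST")
hi <- as.POSIXct("2020-06-15 09:00:00", tz = "EST")

# `|` is TRUE when either comparison holds, so every time passes the test.
either <- times > lo | times < hi   # TRUE TRUE TRUE

# `&` needs both comparisons to hold, which is what "between" requires.
# Note that `times` is written out again on the right-hand side.
both <- times > lo & times < hi     # FALSE TRUE FALSE
```

Any time is either later than the lower bound or earlier than the upper bound, which is why the | version would select every row.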

Langtang's solution is very neat. Another option using dplyr and lubridate is:

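library(dplyr)
library(lubridate)
# interval() reads bare strings as UTC, so give the bounds the column's time zone.
rng <- interval(ymd_hms("2020-06-15 08:30:00", tz = "EST"),
                ymd_hms("2020-06-15 09:00:00", tz = "EST"))
# na_if() matches values, not a logical mask, so replace() is used instead.
# %within% includes both endpoints (08:30 and 09:00).
df %>% mutate(diff_20819828_B1LH_DOUBLE_CHECK = replace(
  diff_20819828_B1LH_DOUBLE_CHECK,
  Date_Time_GMT_3 %within% rng,
  NA
))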
