Creating a dataset by filtering a datetime column to 4-hour intervals

Posted 2025-01-11 02:23:20

I want to create a dataset that keeps only observations taken 4 hours apart.
I currently have a dataset of hourly observations.
Dividing the data does not work, because the resulting samples would not be 4 hours apart.

IndividId   DateTimeLMT          isDaylight    YEAR  month   day  hour
  <chr>     <dttm>                  <chr>      <dbl> <dbl> <dbl> <dbl>
1 M_15_03   2017-01-13 15:00:42     True        2017     1    13    15
2 M_15_03   2017-01-13 16:00:14     False       2017     1    13    16
3 M_15_03   2017-01-13 17:00:09     False       2017     1    13    17
4 M_15_03   2017-01-13 18:00:42     False       2017     1    13    18
5 M_15_03   2017-01-13 19:00:14     False       2017     1    13    19
6 M_15_03   2017-01-13 20:00:45     False       2017     1    13    20 

I am looking for a formula that produces something like this:

IndividId   DateTimeLMT          isDaylight    YEAR  month   day  hour
  <chr>     <dttm>                  <chr>      <dbl> <dbl> <dbl> <dbl>
1 M_15_03   2017-01-13 15:00:42     True        2017     1    13    15
2 M_15_03   2017-01-13 19:00:14     False       2017     1    13    19
3 M_15_03   2017-01-13 23:00:09     False       2017     1    13    23
4 M_15_03   2017-01-14 03:00:42     False       2017     1    14    03
5 M_15_03   2017-01-14 07:00:14     False       2017     1    14    07
6 M_15_03   2017-01-14 11:00:45     True        2017     1    14    11  

Comments (1)

清风不识月 2025-01-18 02:23:20

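You can use lubridate::floor_date. With unit = "4 hours", floor_date rounds down to fixed clock boundaries (..., 12:00, 16:00, 20:00, 24:00), so the sample data would be binned starting at 16:00 rather than 15:00. Applying floor_date to lead(DateTimeLMT) labels each row with the bin of the next row, which makes the bins start at 15:00 and 19:00. lead leaves an NA in the last row, which you can fill with tidyr::fill.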

library(lubridate)
library(tidyr)
library(dplyr)

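data %>% 
  # Label each row with the 4-hour floor of the NEXT row's timestamp,
  # so the bins cover 15:00-18:00, 19:00-22:00, ...
  mutate(fl = floor_date(lead(DateTimeLMT), unit = "4 hours")) %>% 
  # lead() leaves NA in the last row; carry the previous bin label down
  fill(fl) %>% 
  group_by(IndividId, fl) %>% 
  # Keep the first observation of each bin; as.logical() accepts both
  # the strings "True"/"False" and values already stored as logical
  summarise(across(c(DateTimeLMT, YEAR:hour), first),
            isDaylight = any(as.logical(isDaylight)))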

# A tibble: 2 x 8
# Groups:   IndividId [1]
  IndividId fl                  DateTimeLMT          YEAR month   day  hour isDaylight
  <chr>     <dttm>              <dttm>              <int> <int> <int> <int> <lgl>     
1 M_15_03   2017-01-13 16:00:00 2017-01-13 15:00:42  2017     1    13    15 TRUE      
2 M_15_03   2017-01-13 20:00:00 2017-01-13 19:00:14  2017     1    13    19 FALSE     
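
To see why the lead() step is needed, the sketch below floors a few of the question's timestamps directly. It is only an illustration: the vector name x is made up here, and the timestamps are assumed to be parsed as UTC, which is the ymd_hms default.

```r
library(lubridate)

# Four timestamps copied from the question's sample rows (parsed as UTC).
x <- ymd_hms(c("2017-01-13 15:00:42", "2017-01-13 16:00:14",
               "2017-01-13 19:00:14", "2017-01-13 20:00:45"))

# Flooring directly snaps to fixed clock boundaries (multiples of 4 hours),
# so the 15:00 row lands in the 12:00 bin, not in a bin of its own.
direct_bins <- floor_date(x, unit = "4 hours")
hour(direct_bins)  # 12 16 16 20
```

Because 15:00 and 16:00 fall into different bins here, the answer floors the next row's timestamp instead. That trick relies on the series starting one hour before a 4-hour boundary, as it does in the sample data.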

Data

data <- read.table(header = TRUE, text = "IndividId   DateTimeLMT          isDaylight    YEAR  month   day  hour
1 M_15_03   2017-01-13-15:00:42     True        2017     1    13    15
2 M_15_03   2017-01-13-16:00:14     False       2017     1    13    16
3 M_15_03   2017-01-13-17:00:09     False       2017     1    13    17
4 M_15_03   2017-01-13-18:00:42     False       2017     1    13    18
5 M_15_03   2017-01-13-19:00:14     False       2017     1    13    19
6 M_15_03   2017-01-13-20:00:45     False       2017     1    13    20 ")
# read.table() reads DateTimeLMT as text; convert it to a date-time first
data$DateTimeLMT <- lubridate::ymd_hms(data$DateTimeLMT)
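
If the series does not start one hour before a 4-hour clock boundary, an alternative is to measure time from each individual's first observation rather than from the clock. The following is a minimal sketch under that assumption, not part of the answer above: the names obs, every_4h and elapsed_h are hypothetical, and the timestamps after the sixth row are invented to extend the sample.

```r
library(dplyr)

# Hypothetical hourly fixes for one animal, starting at 15:00:42 with a few
# seconds of jitter, similar to the question's data (later rows are made up).
obs <- tibble(
  IndividId   = "M_15_03",
  DateTimeLMT = as.POSIXct("2017-01-13 15:00:00", tz = "UTC") +
    3600 * 0:9 + c(42, 14, 9, 42, 14, 45, 3, 20, 31, 7)
)

every_4h <- obs %>%
  group_by(IndividId) %>%
  arrange(DateTimeLMT, .by_group = TRUE) %>%
  # Whole hours elapsed since this individual's first observation
  mutate(elapsed_h = round(as.numeric(
    difftime(DateTimeLMT, first(DateTimeLMT), units = "hours")))) %>%
  # Keep the observations 0, 4, 8, ... hours after the first one
  filter(elapsed_h %% 4 == 0) %>%
  ungroup()
```

Grouping before computing the elapsed time keeps each individual's schedule separate. If an hourly fix is missing at a 4-hour mark, this sketch simply skips that slot rather than substituting a neighbouring observation.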