Creating a dataset by filtering 4-hour intervals from a datetime column

Posted on 2025-01-11 02:23:20

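I want to create a dataset that keeps only the observations spaced 4 hours apart. My current dataset has hourly observations. Simply dividing it up does not work, because the resulting samples would not be 4 hours apart.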

IndividId   DateTimeLMT          isDaylight    YEAR  month   day  hour
  <chr>     <dttm>                  <chr>      <dbl> <dbl> <dbl> <dbl>
1 M_15_03   2017-01-13 15:00:42     True        2017     1    13    15
2 M_15_03   2017-01-13 16:00:14     False       2017     1    13    16
3 M_15_03   2017-01-13 17:00:09     False       2017     1    13    17
4 M_15_03   2017-01-13 18:00:42     False       2017     1    13    18
5 M_15_03   2017-01-13 19:00:14     False       2017     1    13    19
6 M_15_03   2017-01-13 20:00:45     False       2017     1    13    20

The result I am looking for is a formula that produces something like this:

IndividId   DateTimeLMT          isDaylight    YEAR  month   day  hour
  <chr>     <dttm>                  <chr>      <dbl> <dbl> <dbl> <dbl>
1 M_15_03   2017-01-13 15:00:42     True        2017     1    13    15
2 M_15_03   2017-01-13 19:00:14     False       2017     1    13    19
3 M_15_03   2017-01-13 23:00:09     False       2017     1    13    23
4 M_15_03   2017-01-14 03:00:42     False       2017     1    14    03
5 M_15_03   2017-01-14 07:00:14     False       2017     1    14    07
6 M_15_03   2017-01-14 11:00:45     True        2017     1    14    11

清风不识月 2025-01-18 02:23:20

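You can use lubridate::floor_date. With unit = "4 hours", floor_date rounds each timestamp down to the 4-hour boundaries (..., 12:00, 16:00, 20:00), so the 15:00 observation would land in the 12:00 bin. Applying lead first puts every row into the bin of the next observation, so the windows start at 15:00 and 19:00 instead. lead leaves an NA in the last row, which tidyr::fill fills in from the row above.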

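library(lubridate)
library(tidyr)
library(dplyr)

# Assumes DateTimeLMT is already a date-time (<dttm>), as in the question
data %>% 
  group_by(IndividId) %>%   # keep lead() and fill() within each individual
  # bin each row by the 4-hour floor of the NEXT observation's timestamp
  mutate(fl = floor_date(lead(DateTimeLMT), unit = "4 hours")) %>% 
  fill(fl) %>%              # the last row is NA after lead(); carry the bin down
  group_by(IndividId, fl) %>% 
  # keep the first observation of each 4-hour window
  summarise(across(c(DateTimeLMT, YEAR:hour), first),
            # TRUE if any row in the window is daylight; works for "True" and TRUE
            isDaylight = any(toupper(isDaylight) == "TRUE"))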

# A tibble: 2 x 8
# Groups:   IndividId [1]
  IndividId fl                  DateTimeLMT          YEAR month   day  hour isDaylight
  <chr>     <dttm>              <dttm>              <int> <int> <int> <int> <lgl>     
1 M_15_03   2017-01-13 16:00:00 2017-01-13 15:00:42  2017     1    13    15 TRUE      
2 M_15_03   2017-01-13 20:00:00 2017-01-13 19:00:14  2017     1    13    19 FALSE
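
To see why lead is needed, here is a small sketch that bins the six sample timestamps with and without it. The vector names x, plain and shifted are made up for this illustration, and the timestamps are parsed as UTC for simplicity.

```r
library(lubridate)
library(dplyr)

# The six sample timestamps from the question (parsed as UTC for illustration)
x <- ymd_hms(c("2017-01-13 15:00:42", "2017-01-13 16:00:14",
               "2017-01-13 17:00:09", "2017-01-13 18:00:42",
               "2017-01-13 19:00:14", "2017-01-13 20:00:45"))

# Without lead(): 15:00 falls into the 12:00 bin, 16:00-19:00 into the 16:00 bin
plain <- floor_date(x, unit = "4 hours")

# With lead(): each row takes the bin of the next row; the last value becomes NA
shifted <- floor_date(lead(x), unit = "4 hours")

print(data.frame(x, plain, shifted))
```

In the shifted column the rows from 15:00 to 18:00 share the 16:00 bin and the 19:00 row opens the 20:00 bin, which is the grouping the pipeline above relies on.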

Data

data <- read.table(header = TRUE, text = "IndividId   DateTimeLMT          isDaylight    YEAR  month   day  hour
1 M_15_03   2017-01-13-15:00:42     True        2017     1    13    15
2 M_15_03   2017-01-13-16:00:14     False       2017     1    13    16
3 M_15_03   2017-01-13-17:00:09     False       2017     1    13    17
4 M_15_03   2017-01-13-18:00:42     False       2017     1    13    18
5 M_15_03   2017-01-13-19:00:14     False       2017     1    13    19
6 M_15_03   2017-01-13-20:00:45     False       2017     1    13    20 ")
# read.table leaves DateTimeLMT as text; parse it so floor_date() can use it
data$DateTimeLMT <- lubridate::ymd_hms(data$DateTimeLMT)
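
If the goal is to keep the original rows themselves, as in the desired output in the question, rather than summarise each 4-hour window, one alternative is to filter on the hours elapsed since each individual's first observation. This is a different technique from the floor_date approach above, shown only as a rough sketch. The obs data frame, its jitter values and the hrs helper column are made up for illustration, and the approach assumes the hourly series has no gaps.

```r
library(dplyr)
library(lubridate)

# Hypothetical hourly series: 21 observations with a few seconds of jitter
jitter <- rep(c(0, -28, -33), 7)
obs <- data.frame(
  IndividId   = "M_15_03",
  DateTimeLMT = ymd_hms("2017-01-13 15:00:42") + 3600 * (0:20) + jitter
)

every_4h <- obs %>%
  group_by(IndividId) %>%
  arrange(DateTimeLMT, .by_group = TRUE) %>%
  # whole hours elapsed since this individual's first observation
  mutate(hrs = round(as.numeric(difftime(DateTimeLMT, first(DateTimeLMT),
                                         units = "hours")))) %>%
  # keep the rows that are 0, 4, 8, ... hours after the first one
  filter(hrs %% 4 == 0) %>%
  ungroup() %>%
  select(-hrs)

print(every_4h)
```

Rounding the elapsed hours absorbs the few seconds of drift between observations. If an hour is missing from the series, the row that should have filled that 4-hour slot is simply skipped, so a series with gaps would need different handling.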
