Grouping data in Python using time buckets

Posted on 2025-01-11 13:35:58

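I would like to assign my rows to fixed time buckets and then compute the difference within each bucket for analysis.

For example: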

import pandas as pd
import numpy as np
df = pd.DataFrame({'A': 'A-1 A-1 A-1 A-1 A-1 A-1'.split(),
                   'Date':'23.10.2021 23.10.2021 23.10.2021 23.10.2021 23.10.2021 23.10.2021'.split(),
                   'Time': '06:05:31 06:11:13 06:19:22 06:25:03 06:33:12 06:44:05'.split(),
                   'Cumulative': '12 17 19 23 29 38'.split()})
print(df)

Output:

     A        Date      Time Cumulative
0  A-1  23.10.2021  06:05:31         12
1  A-1  23.10.2021  06:11:13         17
2  A-1  23.10.2021  06:19:22         19
3  A-1  23.10.2021  06:25:03         23
4  A-1  23.10.2021  06:33:12         29
5  A-1  23.10.2021  06:44:05         38

What I'd like is to round each time up to the next 15-minute boundary and find the difference within each interval. Step 1:

     A        Date      Time Cumulative      TimeBuckets
0  A-1  23.10.2021  06:05:31         12         06:15:00 
1  A-1  23.10.2021  06:11:13         17         06:15:00 
2  A-1  23.10.2021  06:19:22         19         06:30:00 
3  A-1  23.10.2021  06:25:03         23         06:30:00 
4  A-1  23.10.2021  06:33:12         29         06:45:00 
5  A-1  23.10.2021  06:44:05         38         06:45:00

In the final stage, as a separate DataFrame, the difference between the maximum and minimum value in each time bucket would be written:

     A         Diff   TimeBuckets
0  A-1            5      06:15:00  
1  A-1            4      06:30:00    
2  A-1            9      06:45:00



你的笑 2025-01-18 13:35:58

IIUC, you could use dt.ceil and GroupBy.agg:

(df.assign(Cumulative=df['Cumulative'].astype(int),
           TimeBuckets=pd.to_datetime(df['Time']).dt.ceil('15min').dt.time
          )
   .groupby('TimeBuckets', as_index=False)
   .agg({'A': 'first', 'Cumulative': lambda x: x.max()-x.min()})
)

Output:

  TimeBuckets    A  Cumulative
0    06:15:00  A-1           5
1    06:30:00  A-1           4
2    06:45:00  A-1           9
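
A slightly extended sketch of the same idea, which is not part of the original answer. It assumes the Date column is day-first (%d.%m.%Y), combines Date and Time so that buckets from different days would stay distinct, and also groups by A so that several series would not be mixed together. The names timestamps, step1, and result are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    'A': 'A-1 A-1 A-1 A-1 A-1 A-1'.split(),
    'Date': ['23.10.2021'] * 6,
    'Time': '06:05:31 06:11:13 06:19:22 06:25:03 06:33:12 06:44:05'.split(),
    'Cumulative': '12 17 19 23 29 38'.split(),
})

# Combine Date and Time into one timestamp
# (assumption: the dates are day-first, as the sample data suggests).
timestamps = pd.to_datetime(df['Date'] + ' ' + df['Time'],
                            format='%d.%m.%Y %H:%M:%S')

# Step 1: convert Cumulative to integers and round every timestamp
# up to the next 15-minute boundary.
step1 = df.assign(
    Cumulative=df['Cumulative'].astype(int),
    TimeBuckets=timestamps.dt.ceil('15min'),
)

# Final step: max minus min of Cumulative for each (A, bucket) pair.
result = (
    step1.groupby(['A', 'TimeBuckets'], as_index=False)
         .agg(Diff=('Cumulative', lambda s: s.max() - s.min()))
)

# Keep only the time of day for display, as in the question.
result['TimeBuckets'] = result['TimeBuckets'].dt.time
print(result)
```

For the sample data this gives Diff values of 5, 4, and 9 for the 06:15, 06:30, and 06:45 buckets. Note that dt.ceil leaves a timestamp that already sits exactly on a boundary in that bucket, so 06:15:00 would fall into 06:15:00 rather than 06:30:00.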
