pandas: fill in the given missing time intervals in a DataFrame
I have a DataFrame looking like:
gap_id | species | time_start | time_stop |
---|---|---|---|
1 | wheat | 2021-11-22 00:01:00 | 2021-11-22 00:03:00 |
2 | fescue | 2021-12-18 05:52:00 | 2021-12-18 05:53:00 |
I would like to expand the DataFrame such that I get as many rows as the number of minutes between time_start and time_stop for each gap_id:
gap_id | species | time |
---|---|---|
1 | wheat | 2021-11-22 00:01:00 |
1 | wheat | 2021-11-22 00:02:00 |
1 | wheat | 2021-11-22 00:03:00 |
2 | fescue | 2021-12-18 05:52:00 |
2 | fescue | 2021-12-18 05:53:00 |
I've tried pd.date_range, but I don't know how to combine it with a groupby on gap_id.
Thanks in advance
Comments (1)
If the DataFrame is small and performance is not important, generate a date_range for each row and then use DataFrame.explode.
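The code that accompanied this suggestion was not preserved here, so the following is only a minimal sketch of the date_range plus explode idea, assuming the column names from the question and a one-minute step. The names `df` and `out` are illustrative.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "gap_id": [1, 2],
    "species": ["wheat", "fescue"],
    "time_start": pd.to_datetime(["2021-11-22 00:01:00", "2021-12-18 05:52:00"]),
    "time_stop": pd.to_datetime(["2021-11-22 00:03:00", "2021-12-18 05:53:00"]),
})

# One DatetimeIndex per row, covering start..stop inclusive in 1-minute steps
df["time"] = df.apply(
    lambda row: pd.date_range(row["time_start"], row["time_stop"], freq="min"),
    axis=1,
)

# explode turns each element of the per-row range into its own row
out = (
    df.drop(columns=["time_start", "time_stop"])
      .explode("time", ignore_index=True)
)

# explode can leave an object column, so convert back to datetime64
out["time"] = pd.to_datetime(out["time"])

print(out)  # 5 rows: three for gap 1, two for gap 2
```

This calls date_range once per row in Python, which is why it is only suggested for small inputs.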
For large DataFrames, first repeat the index by the difference between the time_start and time_stop columns in minutes, then add a per-row counter from GroupBy.cumcount, converted to timedeltas with to_timedelta.
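The original code for this variant is also missing, so here is a hedged sketch of the repeat plus cumcount approach. It assumes the column names from the question, a unique index (the default RangeIndex) and gaps that span whole minutes. The names `n_rows`, `offset` and `out` are illustrative.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "gap_id": [1, 2],
    "species": ["wheat", "fescue"],
    "time_start": pd.to_datetime(["2021-11-22 00:01:00", "2021-12-18 05:52:00"]),
    "time_stop": pd.to_datetime(["2021-11-22 00:03:00", "2021-12-18 05:53:00"]),
})

# Rows needed per gap: whole minutes between start and stop, both ends included
n_rows = (
    (df["time_stop"] - df["time_start"]).dt.total_seconds() // 60 + 1
).astype(int).to_numpy()

# Repeat every original row n_rows times; time_start becomes the base time
out = (
    df.loc[df.index.repeat(n_rows)]
      .drop(columns="time_stop")
      .rename(columns={"time_start": "time"})
)

# cumcount numbers the copies of each original row 0, 1, 2, ...;
# read as minutes, that is the offset to add to the start time
offset = pd.to_timedelta(out.groupby(level=0).cumcount(), unit="min")
out["time"] = out["time"] + offset
out = out.reset_index(drop=True)

print(out)  # 5 rows: three for gap 1, two for gap 2
```

Grouping with level=0 relies on the repeated index labels identifying the original rows; if the index is not unique, grouping on gap_id instead should give the same counter, provided gap_id is unique per row.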