Finding consecutive minutes in pandas

Posted on 2025-02-09 11:30:45

I am following this post: "Pandas output date, start and end time and event status given datetime continuity".

The post shows how to test for consecutive hours. I need to test for consecutive minutes, so I changed the divisor in this line from 3600 to 60 (hours to minutes):

#test consecutive minutes
df['g'] = df['Date'].diff().dt.total_seconds().div(60).ne(1)

The result is True for every row, even for rows that are exactly one minute apart:

Date                  meter    g
2009-02-13 13:23:00   53.49    True
2009-02-13 13:24:00   64.91    True
2009-02-13 13:25:00   32.04    True
2009-02-13 13:26:00   45.94    True
2009-02-13 15:45:00   45.94    True

The result should be:

Date                  meter    g
2009-02-13 13:23:00   53.49    True
2009-02-13 13:24:00   64.91    False
2009-02-13 13:25:00   32.04    False
2009-02-13 13:26:00   45.94    False
2009-02-13 15:45:00   45.94    True

What is wrong here?

愿与i 2025-02-16 11:30:46

The issue with your code is likely due to floating-point approximation. Rounding the values before the comparison should fix it:

pd.to_datetime(df['Date']).diff().dt.total_seconds().div(60).round().ne(1)
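
As a rough illustration of how rounding can change the result, here is a minimal sketch. It assumes the timestamps carry a few milliseconds of jitter that the printed table hides; that jitter is a hypothetical, since the question does not show the raw values.

```python
import pandas as pd

# Hypothetical data: the question's rows, plus a few milliseconds of jitter
# (an assumption made only to reproduce the all-True symptom).
df = pd.DataFrame({
    "Date": pd.to_datetime([
        "2009-02-13 13:23:00.000",
        "2009-02-13 13:24:00.004",
        "2009-02-13 13:25:00.001",
        "2009-02-13 13:26:00.003",
        "2009-02-13 15:45:00.000",
    ]),
    "meter": [53.49, 64.91, 32.04, 45.94, 45.94],
})

# Gap to the previous row, in minutes (NaN for the first row).
minutes = df["Date"].diff().dt.total_seconds().div(60)

strict = minutes.ne(1)           # exact comparison: every gap is slightly off 1.0
rounded = minutes.round().ne(1)  # rounding to whole minutes absorbs the jitter

print(pd.DataFrame({"minutes": minutes, "strict": strict, "rounded": rounded}))
```

With this data the exact comparison flags every row, while the rounded comparison flags only the first row and the row after the long gap. If the real timestamps are exact whole minutes, the cause is probably something else and is worth checking with `df['Date'].diff()` directly.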

However, there is a much better way: compare the Timedelta values directly:

df['g'] = pd.to_datetime(df['Date']).diff().ne('1min')

Output:

                  Date  meter      g
0  2009-02-13 13:23:00  53.49   True
1  2009-02-13 13:24:00  64.91  False
2  2009-02-13 13:25:00  32.04  False
3  2009-02-13 13:26:00  45.94  False
4  2009-02-13 15:45:00  45.94   True
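
For reference, a self-contained version of this comparison on the question's five rows might look like the sketch below. It passes an explicit `pd.Timedelta` instead of the string `'1min'`; that substitution is a stylistic choice made here, not part of the original answer.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "Date": [
        "2009-02-13 13:23:00",
        "2009-02-13 13:24:00",
        "2009-02-13 13:25:00",
        "2009-02-13 13:26:00",
        "2009-02-13 15:45:00",
    ],
    "meter": [53.49, 64.91, 32.04, 45.94, 45.94],
})

# Parse the strings, take the gap to the previous row, and flag every row
# whose gap is not exactly one minute. The first row has no previous row,
# so its gap is NaT and it is flagged as well.
dates = pd.to_datetime(df["Date"])
df["g"] = dates.diff().ne(pd.Timedelta(minutes=1))

print(df)
```

Comparing Timedelta values avoids the conversion to floating-point minutes altogether, so no rounding step is needed.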

For your initial question (first and last of group):

forward = pd.to_datetime(df['Date']).diff().ne('1min')
reverse = (-pd.to_datetime(df['Date']).diff(-1)).ne('1min')
df['g'] = forward|reverse

Output:

                  Date  meter      g
0  2009-02-13 13:23:00  53.49   True
1  2009-02-13 13:24:00  64.91  False
2  2009-02-13 13:25:00  32.04  False
3  2009-02-13 13:26:00  45.94   True
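
To show how the forward and reverse flags combine, here is a minimal sketch on the question's five-row sample. The `run_id` grouping and the `start`/`end` summary at the end are additions for illustration and are not part of the original answer.

```python
import pandas as pd

# Sample rows copied from the question, parsed as datetimes up front.
df = pd.DataFrame({
    "Date": pd.to_datetime([
        "2009-02-13 13:23:00",
        "2009-02-13 13:24:00",
        "2009-02-13 13:25:00",
        "2009-02-13 13:26:00",
        "2009-02-13 15:45:00",
    ]),
    "meter": [53.49, 64.91, 32.04, 45.94, 45.94],
})

one_minute = pd.Timedelta(minutes=1)

# True where a row does not follow its predecessor by exactly one minute,
# i.e. the row starts a new run.
forward = df["Date"].diff().ne(one_minute)

# True where a row is not followed by its successor exactly one minute later,
# i.e. the row ends a run. diff(-1) is negative, hence the sign flip.
reverse = (-df["Date"].diff(-1)).ne(one_minute)

# A row is flagged if it is the first or the last row of its run.
df["g"] = forward | reverse

# Illustrative extra: number the runs and summarize their start and end times.
run_id = forward.cumsum().rename("run")
runs = df.groupby(run_id)["Date"].agg(start="min", end="max")

print(df)
print(runs)
```

On this sample the flags come out as True, False, False, True, True: the isolated 15:45 row is both the first and the last row of its own run, so it is flagged too.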
