Finding consecutive minutes in pandas

Posted on 2025-02-09 11:30:45, 925 characters, 1 view, 0 comments

I am following this post: "Pandas output date, start and end time and event status given datetime continuity".

That post shows how to test for consecutive hours. I need to test for consecutive minutes instead, so I changed the divisor in the code from 3600 to 60 (hours to minutes):

# test consecutive minutes
df['g'] = df['Date'].diff().dt.total_seconds().div(60).ne(1)

The result is True for every row, including rows that are exactly one minute after the previous one.

Date                  meter    g
2009-02-13 13:23:00   53.49    True
2009-02-13 13:24:00   64.91    True
2009-02-13 13:25:00   32.04    True
2009-02-13 13:26:00   45.94    True
2009-02-13 15:45:00   45.94    True

The result should be:

Date                  meter    g
2009-02-13 13:23:00   53.49    True
2009-02-13 13:24:00   64.91    False
2009-02-13 13:25:00   32.04    False
2009-02-13 13:26:00   45.94    False
2009-02-13 15:45:00   45.94    True

What is wrong here?

Comments (1)

愿与i 2025-02-16 11:30:46

The issue with your code is likely floating-point approximation: if the differences are not exactly 60 seconds, dividing by 60 does not give exactly 1. Rounding the values would work around this:

pd.to_datetime(df['Date']).diff().dt.total_seconds().div(60).round().ne(1)
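
To illustrate how the exact comparison can fail, here is a minimal sketch. It assumes, purely hypothetically, that the timestamps carry sub-second components that are not visible in the printed table; the values below are made up for the demonstration and are not the asker's data.

```python
import pandas as pd

# Hypothetical timestamps: roughly one minute apart, but with a few
# milliseconds of drift that a rounded display would hide.
dates = pd.Series(pd.to_datetime([
    "2009-02-13 13:23:00.000",
    "2009-02-13 13:24:00.001",
    "2009-02-13 13:25:00.003",
]))

# Difference to the previous row, expressed in minutes (first row is NaN).
minutes = dates.diff().dt.total_seconds().div(60)

# Exact comparison: 1.0000167 != 1, so every row is flagged.
exact = minutes.ne(1)

# Rounded comparison: the drift disappears, only the first row is flagged.
rounded = minutes.round().ne(1)

print(pd.DataFrame({"minutes": minutes, "exact": exact, "rounded": rounded}))
```

If the real data shows the same pattern, printing `df['Date'].diff()` directly is a quick way to check what the differences actually are.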

A much better way, however, is to compare the Timedelta values directly:

df['g'] = pd.to_datetime(df['Date']).diff().ne('1min')

Output:

                  Date  meter      g
0  2009-02-13 13:23:00  53.49   True
1  2009-02-13 13:24:00  64.91  False
2  2009-02-13 13:25:00  32.04  False
3  2009-02-13 13:26:00  45.94  False
4  2009-02-13 15:45:00  45.94   True
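
For reference, the Timedelta comparison can be reproduced end to end with the sample data from the question. This is a sketch that builds the DataFrame by hand; it uses `pd.Timedelta(minutes=1)` in place of the `'1min'` string, which is an equivalent but more explicit way to write the same comparison.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Date": [
        "2009-02-13 13:23:00",
        "2009-02-13 13:24:00",
        "2009-02-13 13:25:00",
        "2009-02-13 13:26:00",
        "2009-02-13 15:45:00",
    ],
    "meter": [53.49, 64.91, 32.04, 45.94, 45.94],
})

# Make sure the column is datetime64, not strings.
df["Date"] = pd.to_datetime(df["Date"])

# True where the gap to the previous row is not exactly one minute.
# The first row has no previous row (NaT), so it is also True.
df["g"] = df["Date"].diff().ne(pd.Timedelta(minutes=1))

print(df)
```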

For your initial question (flagging the first and last row of each group):

forward = pd.to_datetime(df['Date']).diff().ne('1min')
reverse = (-pd.to_datetime(df['Date']).diff(-1)).ne('1min')
df['g'] = forward|reverse

Output:

                  Date  meter      g
0  2009-02-13 13:23:00  53.49   True
1  2009-02-13 13:24:00  64.91  False
2  2009-02-13 13:25:00  32.04  False
3  2009-02-13 13:26:00  45.94   True
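
If the goal is the start and end time of each consecutive run rather than a per-row flag, one possible sketch is to turn the forward flag into a run label with `cumsum` and aggregate per label. The names `run` and `summary` are illustrative assumptions, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Date": pd.to_datetime([
        "2009-02-13 13:23:00",
        "2009-02-13 13:24:00",
        "2009-02-13 13:25:00",
        "2009-02-13 13:26:00",
        "2009-02-13 15:45:00",
    ]),
    "meter": [53.49, 64.91, 32.04, 45.94, 45.94],
})

# True at the first row of every run of consecutive minutes.
starts = df["Date"].diff().ne(pd.Timedelta(minutes=1))

# Cumulative sum of the flags gives each run its own integer label.
run = starts.cumsum().rename("run")

# Start time, end time and row count for each run.
summary = df.groupby(run)["Date"].agg(start="min", end="max", rows="count")

print(summary)
```

With the sample data this gives one run from 13:23 to 13:26 (four rows) and a second run containing only the 15:45 row.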