Cumulative time sum from pandas timestamps

Published 2025-01-13 06:21:27

I am trying to compute the elapsed time (in seconds) between the timestamps in a given trace of events. The input and output data are in a pandas DataFrame. How can this be done?

Example input:

   CaseID                         Timestamps
        0   2016-01-01 09:51:15.304000+00:00    
        0   2016-01-01 09:53:15.352000+00:00    
        1   2016-01-01 09:51:15.774000+00:00    
        1   2016-01-01 09:51:47.392000+00:00    
        1   2016-01-01 09:52:15.403000+00:00        

I would like the sum to accumulate within each case, disregarding minuscule differences such as milliseconds.

Example output:

   CaseID       sum_time
        0              0
        0            120
        1              0
        1             32
        1             60



Comments (1)

无敌元气妹 2025-01-20 06:21:27

This should solve the problem:

import numpy as np
import pandas as pd

# recreate the original data
ts = """\
2016-01-01 09:51:15.304000+00:00
2016-01-01 09:53:15.352000+00:00
2016-01-01 09:51:15.774000+00:00
2016-01-01 09:51:47.392000+00:00
2016-01-01 09:52:15.403000+00:00""".split("\n")

df = pd.DataFrame({"CaseID": [0, 0, 1, 1, 1],
                   "Timestamp": [pd.Timestamp(tmp) for tmp in ts]})


# solve the problem

def calc_csum(partial_frame):
    """
    Take a data frame with a Timestamp column and
    add a new column with the cumulative sum in seconds.
    """
    # 1. create the difference series (the first entry is NaT)
    r = partial_frame.Timestamp.diff()

    # 2. fill the first value (NaT) with zero
    r = r.fillna(pd.Timedelta(0))

    # 3. convert to seconds and use cumsum -> new column
    partial_frame["cs"] = np.cumsum(r.dt.total_seconds().values)
    return partial_frame

# apply to each "sub frame" with the same CaseID
# (group_keys=False keeps the original flat index)
res = df.groupby("CaseID", group_keys=False).apply(calc_csum)
print(res)

Result:

    CaseID                        Timestamp       cs
0       0   2016-01-01 09:51:15.304000+00:00    0.000
1       0   2016-01-01 09:53:15.352000+00:00  120.048
2       1   2016-01-01 09:51:15.774000+00:00    0.000
3       1   2016-01-01 09:51:47.392000+00:00   31.618
4       1   2016-01-01 09:52:15.403000+00:00   59.629
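As a side note (not part of the original answer), the same result can be sketched without a per-group `apply`, using a grouped `diff` followed by a grouped `cumsum`; rounding to whole seconds then matches the integer `sum_time` column the question asked for. The column names `CaseID` and `Timestamp` follow the example above.

```python
import pandas as pd

df = pd.DataFrame({
    "CaseID": [0, 0, 1, 1, 1],
    "Timestamp": pd.to_datetime([
        "2016-01-01 09:51:15.304000+00:00",
        "2016-01-01 09:53:15.352000+00:00",
        "2016-01-01 09:51:15.774000+00:00",
        "2016-01-01 09:51:47.392000+00:00",
        "2016-01-01 09:52:15.403000+00:00",
    ]),
})

# Difference to the previous timestamp within each case;
# the first row of each case is NaT, which we treat as zero.
secs = (df.groupby("CaseID")["Timestamp"]
          .diff()
          .fillna(pd.Timedelta(0))
          .dt.total_seconds())

# Cumulate per case and round away the milliseconds.
df["sum_time"] = secs.groupby(df["CaseID"]).cumsum().round().astype(int)
print(df)
```

This avoids mutating the group frames inside a callback and keeps the whole computation as column operations.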
