groupby to show time per person per day in pandas

Posted on 2025-01-11 21:21:30

I'm trying to group this dataframe by id and timestamp; the third column is the time difference between consecutive entries. I can get it to display the total sum per id across all days, but I can't get it to display the sum per day per id.

import datetime
import pandas as pd

# One timestamp per row; the person id for each row is noted on the right.
timestamps = [
    datetime.datetime(2018, 1, 1, 10, 0, 0, 0),  # person 1
    datetime.datetime(2018, 1, 1, 10, 0, 0, 0),  # person 2
    datetime.datetime(2018, 1, 1, 11, 0, 0, 0),  # person 1
    datetime.datetime(2018, 1, 2, 11, 0, 0, 0),  # person 3
    datetime.datetime(2018, 1, 1, 10, 0, 0, 0),  # person 2
    datetime.datetime(2018, 1, 2, 11, 0, 0, 0),  # person 1
    datetime.datetime(2018, 1, 4, 10, 0, 0, 0),  # person 3
    datetime.datetime(2018, 1, 5, 12, 0, 0, 0),  # person 2
]
df1 = pd.DataFrame({'person': [1, 2, 1, 3, 2, 1, 3, 2], 'timestamp': timestamps})
# Time since the same person's previous entry (NaT for each person's first entry).
df1['new'] = df1.groupby('person')['timestamp'].diff()

# Total time per person across all days.
df1.groupby('person')['new'].sum()

This just gives me the total, not per day. How do I combine them per day?

Comments (1)

小霸王臭丫头 2025-01-18 21:21:30

You can just include the date part of the "timestamp" column in your groupby condition like this:

>>> df1.groupby(["person", df1.timestamp.dt.date])["new"].sum()
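
To see the one-liner above in context, here is a minimal self-contained sketch that rebuilds the question's dataframe and groups by person and calendar date. The name `per_day` is only illustrative. Note that each time difference is counted on the day of the later entry, since that is the row it is stored on.

```python
import datetime
import pandas as pd

timestamps = [
    datetime.datetime(2018, 1, 1, 10, 0),
    datetime.datetime(2018, 1, 1, 10, 0),
    datetime.datetime(2018, 1, 1, 11, 0),
    datetime.datetime(2018, 1, 2, 11, 0),
    datetime.datetime(2018, 1, 1, 10, 0),
    datetime.datetime(2018, 1, 2, 11, 0),
    datetime.datetime(2018, 1, 4, 10, 0),
    datetime.datetime(2018, 1, 5, 12, 0),
]
df1 = pd.DataFrame({"person": [1, 2, 1, 3, 2, 1, 3, 2], "timestamp": timestamps})

# Time since the same person's previous entry; the first entry per person is NaT.
df1["new"] = df1.groupby("person")["timestamp"].diff()

# Group by person AND by the calendar date extracted from the timestamp.
per_day = df1.groupby(["person", df1["timestamp"].dt.date])["new"].sum()
print(per_day)
```

The result is a Series with a two-level index (person, date), one row per person and day that actually occurs in the data.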

Also, if you prefer, you could create a new column with the date from the timestamp and then group by that column:

>>> df1["date"] = df1["timestamp"].dt.date
>>> df1.groupby(["person", "date"])["new"].sum()
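
As a variation on the helper-column approach (not part of the original answer), the sketch below uses `.dt.normalize()` instead of `.dt.date`. It assumes you would rather keep the `date` column as a datetime64 column (midnight timestamps) than as Python `date` objects, which can be convenient for later resampling or plotting. The data is the question's, written more compactly with `pd.to_datetime`.

```python
import pandas as pd

df1 = pd.DataFrame({
    "person": [1, 2, 1, 3, 2, 1, 3, 2],
    "timestamp": pd.to_datetime([
        "2018-01-01 10:00", "2018-01-01 10:00", "2018-01-01 11:00",
        "2018-01-02 11:00", "2018-01-01 10:00", "2018-01-02 11:00",
        "2018-01-04 10:00", "2018-01-05 12:00",
    ]),
})
df1["new"] = df1.groupby("person")["timestamp"].diff()

# normalize() sets the time of day to 00:00 but keeps the datetime dtype.
df1["date"] = df1["timestamp"].dt.normalize()
per_day = df1.groupby(["person", "date"])["new"].sum()
print(per_day)
```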

Optionally, you can call .reset_index() at the end to turn the group keys (person and date) into regular columns instead of index levels.
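
A short sketch of that last step, using the question's data again; the name `flat` is just an illustrative choice. After `.reset_index()` the result is a plain DataFrame with `person`, `date` and `new` columns.

```python
import pandas as pd

df1 = pd.DataFrame({
    "person": [1, 2, 1, 3, 2, 1, 3, 2],
    "timestamp": pd.to_datetime([
        "2018-01-01 10:00", "2018-01-01 10:00", "2018-01-01 11:00",
        "2018-01-02 11:00", "2018-01-01 10:00", "2018-01-02 11:00",
        "2018-01-04 10:00", "2018-01-05 12:00",
    ]),
})
df1["new"] = df1.groupby("person")["timestamp"].diff()
df1["date"] = df1["timestamp"].dt.date

# The grouped sum is a Series indexed by (person, date);
# reset_index() moves both index levels back into ordinary columns.
flat = df1.groupby(["person", "date"])["new"].sum().reset_index()
print(flat)
```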
