Comparing datetimes between different dataframes

Posted 2025-01-10 06:18:13

I have two different data frames.

df1= 
index Time
0   2009-09-13 01:17:00
1   2009-09-13 02:17:00
2   2009-09-13 03:17:00
3   2009-09-13 04:17:00
4   2009-09-13 05:17:00
............

This is time-series data at hourly intervals, spanning 10 days in total.
The second one is:

df2=
    Report Time                 x
0   2009-09-13 01:17:00         1
1   2009-09-13 02:20:00         27
2   2009-09-13 02:25:00         1
3   2009-09-13 05:33:00         100
..............

For each hourly timestamp in "df1", I want to sum the "x" values of all "df2" rows that fall within the same hour, and store that sum alongside the timestamp.

The output should look like this:

index Time                    x
0   2009-09-13 01:17:00       1
1   2009-09-13 02:17:00      28
2   2009-09-13 03:17:00       0
3   2009-09-13 04:17:00       0
4   2009-09-13 05:17:00       100

Note that all "x" values from "df2" that fall within the same hour should be added together for the corresponding row of "df1".

I tried to implement this with a for loop but couldn't complete it.

Comments (2)

七秒鱼° 2025-01-17 06:18:13

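Use Series.dt.floor to truncate both time columns to the hour, which gives the 2 dataframes a common key to merge on:

import pandas as pd

# Ensure the columns have datetime64 dtype
df1['Time'] = pd.to_datetime(df1['Time'])
df2['Report Time'] = pd.to_datetime(df2['Report Time'])

# Select only 'x' before summing: recent pandas versions raise a TypeError
# when asked to sum the datetime column 'Report Time'.
# 'h' is the hourly alias; the uppercase 'H' is deprecated in recent pandas.
out = (
    df1.merge(df2.groupby(df2['Report Time'].dt.floor('h'))[['x']].sum(),
              left_on=df1['Time'].dt.floor('h'), right_index=True, how='left')
       .fillna(0)
)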

Output:

>>> out
                 Time      x
0 2009-09-13 01:17:00    1.0
1 2009-09-13 02:17:00   28.0
2 2009-09-13 03:17:00    0.0
3 2009-09-13 04:17:00    0.0
4 2009-09-13 05:17:00  100.0

Details:

>>> df2.groupby(df2['Report Time'].dt.floor('h'))[['x']].sum()
                       x
Report Time             
2009-09-13 01:00:00    1
2009-09-13 02:00:00   28
2009-09-13 05:00:00  100

>>> df1['Time'].dt.floor('h')
0   2009-09-13 01:00:00
1   2009-09-13 02:00:00
2   2009-09-13 03:00:00
3   2009-09-13 04:00:00
4   2009-09-13 05:00:00
Name: Time, dtype: datetime64[ns]
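
For readers who want to try the floor-based idea end to end, here is a minimal self-contained sketch. The sample frames are rebuilt from the rows shown in the question, and the variable names `hourly` and `out` are only illustrative. Instead of a merge it uses `Series.map` to look up each floored hour, which is an alternative to the code above and not the answerer's exact method; it assumes "same hour" means the same clock hour.

```python
import pandas as pd

# Sample data copied from the question (only the rows that were shown).
df1 = pd.DataFrame({"Time": pd.to_datetime([
    "2009-09-13 01:17:00", "2009-09-13 02:17:00", "2009-09-13 03:17:00",
    "2009-09-13 04:17:00", "2009-09-13 05:17:00",
])})
df2 = pd.DataFrame({
    "Report Time": pd.to_datetime([
        "2009-09-13 01:17:00", "2009-09-13 02:20:00",
        "2009-09-13 02:25:00", "2009-09-13 05:33:00",
    ]),
    "x": [1, 27, 1, 100],
})

# Total of x per clock hour, indexed by the floored timestamp.
hourly = df2.groupby(df2["Report Time"].dt.floor("h"))["x"].sum()

# Look up each df1 hour in those totals; hours with no reports become 0.
out = df1.assign(
    x=df1["Time"].dt.floor("h").map(hourly).fillna(0).astype(int)
)
print(out)
```

Casting back with `astype(int)` keeps `x` as integers, whereas the merge followed by `fillna(0)` above leaves floats such as 28.0.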
梦纸 2025-01-17 06:18:13

You need to perform a merge_asof (both dataframes must already be sorted by their time columns):

df1['Time'] = pd.to_datetime(df1['Time'])
df2['Report Time'] = pd.to_datetime(df2['Report Time'])

(pd
 .merge_asof(df2, df1, left_on='Report Time', right_on='Time')
 .groupby('Time')
 .agg({'x': 'sum'})
 .reindex(df1['Time'], fill_value=0)
 .reset_index()
 )

Output:

                 Time    x
0 2009-09-13 01:17:00    1
1 2009-09-13 02:17:00   28
2 2009-09-13 03:17:00    0
3 2009-09-13 04:17:00    0
4 2009-09-13 05:17:00  100
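
One point that may be worth checking before choosing between the two answers: with its default backward direction, `merge_asof` assigns each report to the latest "df1" timestamp at or before it, which is not always the same clock hour. The sketch below uses a made-up report at 02:10 (not in the question's data) to show the difference; the names `asof` and `by_hour` are illustrative only.

```python
import pandas as pd

df1 = pd.DataFrame({"Time": pd.to_datetime([
    "2009-09-13 01:17:00", "2009-09-13 02:17:00", "2009-09-13 03:17:00",
    "2009-09-13 04:17:00", "2009-09-13 05:17:00",
])})
# Hypothetical reports: 02:10 falls in clock hour 02 but before df1's 02:17.
df2 = pd.DataFrame({
    "Report Time": pd.to_datetime(["2009-09-13 02:10:00", "2009-09-13 02:20:00"]),
    "x": [5, 27],
})

# merge_asof: each report goes to the last df1 Time <= Report Time.
asof = (
    pd.merge_asof(df2, df1, left_on="Report Time", right_on="Time")
    .groupby("Time")["x"].sum()
    .reindex(df1["Time"], fill_value=0)
    .reset_index()
)

# Clock-hour grouping: both reports share the floored hour 02:00.
floored = df2.groupby(df2["Report Time"].dt.floor("h"))["x"].sum()
by_hour = df1.assign(
    x=df1["Time"].dt.floor("h").map(floored).fillna(0).astype(int)
)

print(asof)     # the 02:10 report is counted under 01:17
print(by_hour)  # both reports are counted under 02:17
```

On the question's own sample data the two approaches agree, so which one is appropriate depends on whether an "hour" means the interval starting at each "df1" timestamp or the clock hour.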