Pandas: group by selected dates

Posted on 2025-01-10 21:31:33

I have a dataframe that is very similar to this dataframe:

index  date       month
0      2019-12-1  12
1      2020-03-1  3
2      2020-07-1  7
3      2021-02-1  2
4      2021-09-1  9

I want to move every date to the closest month in a fixed set of months. The months need to be normalized like this:

Months            Normalized month
3, 4, 5           4
6, 7, 8, 9        8
1, 2, 10, 11, 12  12

So the output will be:

index  date       month
0      2019-12-1  12
1      2020-04-1  4
2      2020-08-1  8
3      2020-12-1  12
4      2021-08-1  8

Comments (3)

著墨染雨君画夕 2025-01-17 21:31:33

You can iterate through the DataFrame and rewrite the month and the date row by row.

import pandas as pd

df = pd.DataFrame(data={'date': ["2019-12-1", "2020-03-1", "2020-07-1", "2021-02-1", "2021-09-1"],
                        'month': [12, 3, 7, 2, 9]})
for index, row in df.iterrows():
    # Pick the normalized month for this row's month.
    if row['month'] in [3, 4, 5]:
        new_month = 4
    elif row['month'] in [6, 7, 8, 9]:
        new_month = 8
    else:
        new_month = 12
    # Rebuild the date from this row's own parts. The year is left unchanged,
    # so 2021-02-1 becomes 2021-12-1, not the 2020-12-1 shown in the question.
    year, _, day = row['date'].split('-')
    df.loc[index, 'month'] = new_month
    df.loc[index, 'date'] = f"{year}-{new_month:02d}-{day}"

夏有森光若流苏 2025-01-17 21:31:33

You can try creating a dictionary that maps each month to its normalized month:

norm_month_dict = {3: 4, 4: 4, 5: 4, 6: 8, 7: 8, 8: 8, 9: 8, 1: 12, 2: 12, 10: 12, 11: 12, 12: 12}

Then use this dictionary to map the month values to their respective normalized month values (the column in the question is named month):

df['normalized_months'] = df['month'].map(norm_month_dict)
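
As a minimal, self-contained sketch of this mapping, assuming the sample data from the question and its column name `month` (the output column name `normalized_month` is only illustrative):

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "date": ["2019-12-1", "2020-03-1", "2020-07-1", "2021-02-1", "2021-09-1"],
    "month": [12, 3, 7, 2, 9],
})

norm_month_dict = {3: 4, 4: 4, 5: 4, 6: 8, 7: 8, 8: 8, 9: 8,
                   1: 12, 2: 12, 10: 12, 11: 12, 12: 12}

# Series.map looks each month up in the dictionary;
# a month missing from the dictionary would become NaN.
df["normalized_month"] = df["month"].map(norm_month_dict)
print(df)
```

This only adds the normalized month; the date column is left as it was.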

骷髅 2025-01-17 21:31:33

You need to construct a dictionary from the second dataframe (assuming the data is in df1 and the month table is in df2):

d = (
 df2.assign(Months=df2['Months'].str.split(', '))
    .explode('Months').astype(int)
    .set_index('Months')['Normalized month'].to_dict()
)
# {3: 4, 4: 4, 5: 4, 6: 8, 7: 8, 8: 8, 9: 8, 1: 12, 2: 12, 10: 12, 11: 12, 12: 12}

Then map the values:

df1['month'] = df1['month'].map(d)

Output:

   index        date   month
0       0  2019-12-1      12
1       1  2020-03-1       4
2       2  2020-07-1       8
3       3  2021-02-1      12
4       4  2021-09-1       8
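
The mapping above changes only the month column. A possible way to update the date column too, so that it matches the expected output in the question, is sketched below. It assumes the dates are strings in `year-month-day` form, and it assumes that January and February should move back to December of the previous year; that rule is inferred from the question's example (2021-02-1 becomes 2020-12-1), not stated in it.

```python
import pandas as pd

# Sample data and mapping dictionary as in the question and the answer above.
df1 = pd.DataFrame({
    "date": ["2019-12-1", "2020-03-1", "2020-07-1", "2021-02-1", "2021-09-1"],
    "month": [12, 3, 7, 2, 9],
})
d = {3: 4, 4: 4, 5: 4, 6: 8, 7: 8, 8: 8, 9: 8, 1: 12, 2: 12, 10: 12, 11: 12, 12: 12}

# Split "year-month-day" strings into three string columns (0, 1, 2).
parts = df1["date"].str.split("-", expand=True)
year = parts[0].astype(int)
day = parts[2]

new_month = df1["month"].map(d)
# Assumption: January and February snap to December of the previous year.
new_year = year - df1["month"].isin([1, 2]).astype(int)

df1["month"] = new_month
# Reassemble the date string with a zero-padded month, keeping the original day.
df1["date"] = new_year.astype(str) + "-" + new_month.astype(str).str.zfill(2) + "-" + day
print(df1)
```

Working on the string parts keeps the question's unpadded day format; converting to real datetimes first would be an alternative if the column is needed for date arithmetic later.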