Assign a sequential progressive ID as values change in a pandas Series

Posted on 2025-01-13 07:12:38

I have the following DataFrame:

date           product_code     discount
01/01/2022          1              0.7
01/01/2022          2              0.5

02/01/2022          1              0.1
02/01/2022          1              0.1
02/01/2022          2              0.5

03/01/2022          1              0.4

04/01/2022          1              0.1
04/01/2022          2              0.1

05/01/2022          1              0.1

06/01/2022          1              0.1
06/01/2022          1              0.5
...

I would like to efficiently assign a sequential ID within each 'product_code' that increments whenever that product's discount changes from one row to the next.

Thus, obtaining:

date           product_code     discount   promotion_id
01/01/2022          1              0.7          1
01/01/2022          2              0.5          1

02/01/2022          1              0.1          2
02/01/2022          1              0.1          2
02/01/2022          2              0.5          1

03/01/2022          1              0.4          3

04/01/2022          1              0.1          4
04/01/2022          2              0.1          2

05/01/2022          1              0.1          4

06/01/2022          1              0.1          4
06/01/2022          1              0.5          5
...

To better illustrate, for a single product case it would be:

date           product_code     discount   promotion_id
01/01/2022          1              0.7          1

02/01/2022          1              0.1          2
02/01/2022          1              0.1          2

03/01/2022          1              0.4          3

04/01/2022          1              0.1          4

05/01/2022          1              0.1          4

06/01/2022          1              0.1          4
06/01/2022          1              0.5          5
...

How can I achieve that?


Comments (1)

输什么也不输骨气 2025-01-20 07:12:38

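You can do this with diff and cumsum within a groupby. Use transform rather than apply so the result keeps the original row index and can be assigned straight back as a column (in recent pandas versions, apply may prepend the group key to the index):

# Per product: flag rows whose discount differs from the previous row (the first row always counts), then take a running total
df['id'] = df.groupby('product_code', sort=False)['discount'].transform(lambda x: x.diff().ne(0).cumsum())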
df
Out[644]: 
          date  product_code  discount  id
0   01/01/2022             1       0.7   1
1   01/01/2022             2       0.5   1
2   02/01/2022             1       0.1   2
3   02/01/2022             1       0.1   2
4   02/01/2022             2       0.5   1
5   03/01/2022             1       0.4   3
6   04/01/2022             1       0.1   4
7   04/01/2022             2       0.1   2
8   05/01/2022             1       0.1   4
9   06/01/2022             1       0.1   4
10  06/01/2022             1       0.5   5
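
For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. It rebuilds the sample frame from the question and uses a lambda-free variant of the same idea (shift within each product, compare, cumulative sum), which is a substitution for the diff-based one-liner above and not the answerer's exact code. The intermediate names `prev` and `changed` are illustrative, and the output column is named `promotion_id` as in the question.

```python
import pandas as pd

# Sample data copied from the question (dates kept as plain strings).
df = pd.DataFrame({
    "date": ["01/01/2022", "01/01/2022", "02/01/2022", "02/01/2022",
             "02/01/2022", "03/01/2022", "04/01/2022", "04/01/2022",
             "05/01/2022", "06/01/2022", "06/01/2022"],
    "product_code": [1, 2, 1, 1, 2, 1, 1, 2, 1, 1, 1],
    "discount": [0.7, 0.5, 0.1, 0.1, 0.5, 0.4, 0.1, 0.1, 0.1, 0.1, 0.5],
})

# Previous discount of the same product (NaN on each product's first row).
prev = df.groupby("product_code", sort=False)["discount"].shift()

# True where a new run starts: the product's first row, or a discount
# that differs from the previous row of that product.
changed = df["discount"].ne(prev)

# Running count of run starts within each product gives the ID.
df["promotion_id"] = (
    changed.astype(int).groupby(df["product_code"], sort=False).cumsum()
)

print(df)
```

Both variants compare floats for exact equality and assume the rows are already in chronological order within each product. If the discounts are computed rather than entered, rounding them first may be safer.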