如何使用有条件的窗口标准在Python中的不同索引级别上进行滚动总和
我想根据索引的不同级别进行滚动和滚动,但正在努力使其成为现实。而不是解释问题,而是在演示输入和所需的输出下以及正在寻找的见解。
因此,我有多个品牌,他们在以下分组的不同年度日的各种商品类别的销售中。我想要的是每天的动态滚动总和,按要求在一年中滚动窗户。
例如,如果有人问
emo问题1)直到某天(不包括那一天),他们对该特定品牌的特定类别的销售是什么。
我需要能够每天回答这个问题,即每一行应该具有一个数字,如表2.0所示。
我希望能够以一种方式进行编码,如果问题从2年变为3年,我只需要更改一个数字。 我还需要在当月的级别上做同样的事情。
demo问题2)直到某天(不包括那一天)他们对该特定类别的最后3个月出售是什么该特定品牌的那一年。
是演示输入
这些桌子由品牌,类别,年,月份,月,日和销售总和分组 所有信息和销售
表1.0
品牌 | 类别 | 年度 | 每天每天 | 的 | 销售 |
---|---|---|---|---|---|
ABC | Big Teppliances | 2021 | 9 | 3 | 0 |
服装 | 2021 | 9 | 2 | 0 | |
电子设备 | 2020 | 10 | 18 | 2 | |
用户 | 2020 | 10 | 18 | 0 | |
2021 | 9 | 2 | 4 | ||
3 | 0 | ||||
XYZ | Big Thepliances | 2012 | 4 | 29 | 7 |
2013 | 4 | 7 | 6 | ||
服装 | 2012 | 4 | 29 | 3 | |
电子 | 2013 | 4 | 9 | 1 | |
27 | 2 | ||||
5 | 4 | 5 | |||
2015 2015 | 4 | 27 | 7 | ||
5 | 2 2 | 2 | |||
粉丝 | 2013 | 4 | 14 | 4 | |
4 5 | 4 | 0 | |||
2015 | 4 0 2015 4 | 18 | 1 5 | ||
17 11 | 11 | 11 | |||
2016 | 4 2016 4 | 12 | 18 | ||
家具 | 2012 | 5 | 4 | 1 | |
8 | 6 | ||||
20 | 4 | ||||
2013 | 4 | 5 | 1 | ||
7 | 8 | ||||
9 | 2 | ||||
2015 | 4 | 18 | 12 | ||
27 | 15 | ||||
5 | 2 | 4 | |||
17 | 3 | ||||
音乐5 | 2012年 | 4 | 10 | 2013 | |
2015 | 5 | 6 | 16 | ||
4 | 18 | 10 | 18 | ||
2015 | 0 | ||||
2016 | 4 | 12 | 1 | ||
16 | 13 | ||||
UTENC | 2012 | 5 | 8 | 2 | |
2016 | 4 | 16 | 3 | ||
18 | 2 | ||||
2017 | 4 | 12 | 13 |
以下是基于演示表的演示问题1的输出(最后2年不包括那天不包括那天)
表2.0
品牌 | 类别的 | 销售 | 销售 | 有条件 | 0 | 货物(直到最后2年) |
---|---|---|---|---|---|---|
ABC | 大型电器 | 2021 | 9 | 3 | 0 | 0 |
服装 | 2021 | 9 | 2 | 0 | 0 | |
电子设备 | 2020 | 10 | 18 | 2 | 年度 | |
用户 | 2020 | 10 | 18 | 0 0 | 0 | |
2021 | 9 | 2 | 4 | 0 | ||
4 | 4 | 0 3 0 3 | ||||
XYZ | BIG BIG电器 | 2012 | 4 | 29 | 7 | 0 |
2013 | 4 | 7 | 6 | 7 | ||
服装 | 2012 | 4 | 29 | 3 | 0 | |
电子 | 2013 | 4 | 9 | 1 | 0 | |
27 | 2 | 1 | ||||
5 4 5 | 4 | 5 | 3 | |||
2015 | 4 | 27 | 7 | 8 | ||
5 | 2 | 2 | 1 | |||
粉丝 | 2013 | 4 | 14 | 4 14 4 | 0 5 4 0 | |
0 | 4 | 4 | 2015 | |||
2015 | 4 | 18 | 1 | 4 | ||
5 | 17 | 11 | 5 | |||
2016 | 4 | 12 | 18 | 12 | ||
家具 | 2012 | 5 | 4 | 1 | 0 | |
8 | 6 | 1 | ||||
20 | 4 | 7 | ||||
2013 | 4 | 5 | 1 | 11 | ||
7 | 8 | 12 | ||||
9 | 2 | 20 | ||||
2015 | 4 | 18 | 12 | 11 | ||
27 | 17 15 | 23 | ||||
5 | 2 | 4 | 38 | |||
17 | 3 | 42 | ||||
音乐界 | 2012年 | 4 | 5 | 5月18日10 | 0 | |
2013 | 0 | 6 | 10 | 12 | ||
2015 | 4 | 16 | 10 | 6 | ||
18 | 0 | 16 | ||||
2016 2016 | 4 | 1 | 10 | 16 | ||
16 | 16 | 11 | ||||
USENCY | 2012 | 5 | 8 | 2 | 4 | |
2016 | 4 | 16 | 3 | 0 | ||
18 | 2 | 3 | ||||
12 13 2017 | 2017 4 | 12 | 13 | 5 |
END想法:
这个想法是基本上是在一年的专栏上进行滚动窗口,以维护2年跨度标准,并继续总结销售数字。
PS我确实需要一个快速解决方案,因此由于数据大小庞大,因此在启用了一个。函数的行,我认为这是不可行的。通过使用某种组滚动总和或支持列的更好解决方案确实很有帮助。
I want to do a rolling sum based on different levels of the index but am struggling to make it a reality. Instead of explaining the problem am giving below the demo input and desired output along with the kind of insights am looking for.
So I have multiple brands and each of their sales of various item categories in different year month day grouped by as below. What I want is a dynamic rolling sum at each day level, rolled over a window on Year as asked.
for eg, if someone asks
Demo question 1) Till a certain day(not including that day) what were their last 2 years' sales of that particular category for that particular brand.
I need to be able to answer this for every single day i.e every single row should have a number as shown in Table 2.0.
I want to be able to code in such a way that if the question changes from 2 years to 3 years I just need to change a number. I also need to do the same thing at the month's level.
demo question 2) Till a certain day(not including that day) what was their last 3 months' sale of that particular category for that particular year for that particular brand.
Below is demo input
The tables are grouped by brand,category,year,month,day and sum of sales from a master table which had all the info and sales at hour level each day
Table 1.0
Brand | Category | Year | Month | Day | Sales |
---|---|---|---|---|---|
ABC | Big Appliances | 2021 | 9 | 3 | 0 |
Clothing | 2021 | 9 | 2 | 0 | |
Electronics | 2020 | 10 | 18 | 2 | |
Utensils | 2020 | 10 | 18 | 0 | |
2021 | 9 | 2 | 4 | ||
3 | 0 | ||||
XYZ | Big Appliances | 2012 | 4 | 29 | 7 |
2013 | 4 | 7 | 6 | ||
Clothing | 2012 | 4 | 29 | 3 | |
Electronics | 2013 | 4 | 9 | 1 | |
27 | 2 | ||||
5 | 4 | 5 | |||
2015 | 4 | 27 | 7 | ||
5 | 2 | 2 | |||
Fans | 2013 | 4 | 14 | 4 | |
5 | 4 | 0 | |||
2015 | 4 | 18 | 1 | ||
5 | 17 | 11 | |||
2016 | 4 | 12 | 18 | ||
Furniture | 2012 | 5 | 4 | 1 | |
8 | 6 | ||||
20 | 4 | ||||
2013 | 4 | 5 | 1 | ||
7 | 8 | ||||
9 | 2 | ||||
2015 | 4 | 18 | 12 | ||
27 | 15 | ||||
5 | 2 | 4 | |||
17 | 3 | ||||
Musical-inst | 2012 | 5 | 18 | 10 | |
2013 | 4 | 5 | 6 | ||
2015 | 4 | 16 | 10 | ||
18 | 0 | ||||
2016 | 4 | 12 | 1 | ||
16 | 13 | ||||
Utencils | 2012 | 5 | 8 | 2 | |
2016 | 4 | 16 | 3 | ||
18 | 2 | ||||
2017 | 4 | 12 | 13 |
Below is desired output for demo question 1 based on the demo table(last 2 years cumsum not including that day)
Table 2.0
Brand | Category | Year | Month | Day | Sales | Conditional Cumsum(till last 2 years) |
---|---|---|---|---|---|---|
ABC | Big Appliances | 2021 | 9 | 3 | 0 | 0 |
Clothing | 2021 | 9 | 2 | 0 | 0 | |
Electronics | 2020 | 10 | 18 | 2 | 0 | |
Utensils | 2020 | 10 | 18 | 0 | 0 | |
2021 | 9 | 2 | 4 | 0 | ||
3 | 0 | 4 | ||||
XYZ | Big Appliances | 2012 | 4 | 29 | 7 | 0 |
2013 | 4 | 7 | 6 | 7 | ||
Clothing | 2012 | 4 | 29 | 3 | 0 | |
Electronics | 2013 | 4 | 9 | 1 | 0 | |
27 | 2 | 1 | ||||
5 | 4 | 5 | 3 | |||
2015 | 4 | 27 | 7 | 8 | ||
5 | 2 | 2 | 15 | |||
Fans | 2013 | 4 | 14 | 4 | 0 | |
5 | 4 | 0 | 4 | |||
2015 | 4 | 18 | 1 | 4 | ||
5 | 17 | 11 | 5 | |||
2016 | 4 | 12 | 18 | 12 | ||
Furniture | 2012 | 5 | 4 | 1 | 0 | |
8 | 6 | 1 | ||||
20 | 4 | 7 | ||||
2013 | 4 | 5 | 1 | 11 | ||
7 | 8 | 12 | ||||
9 | 2 | 20 | ||||
2015 | 4 | 18 | 12 | 11 | ||
27 | 15 | 23 | ||||
5 | 2 | 4 | 38 | |||
17 | 3 | 42 | ||||
Musical-inst | 2012 | 5 | 18 | 10 | 0 | |
2013 | 4 | 5 | 6 | 10 | ||
2015 | 4 | 16 | 10 | 6 | ||
18 | 0 | 16 | ||||
2016 | 4 | 12 | 1 | 10 | ||
16 | 13 | 11 | ||||
Utencils | 2012 | 5 | 8 | 2 | 0 | |
2016 | 4 | 16 | 3 | 0 | ||
18 | 2 | 3 | ||||
2017 | 4 | 12 | 13 | 5 |
End thoughts:
The idea is to basically do a rolling window over year column maintaining the 2 years span criteria and keep on summing the sales figures.
P.S I really need a fast solution due to the huge data size and therefore created a .apply function row-wise which I didn't find feasible. A better solution by using some kind of group rolling sum or supporting columns will be really helpful.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
在这里,我为上述问题提供了示例解决方案。
我只是刻有ONR产品,以便解决方案是简单的
代码:
这里TOT将为给定数据输出所需的货物列。
Output:
Here you can specify the Time span using Number of days in Limit variable.
Hope this solves the problem you are looking for.
Here I'm giving a sample solution for the above problem.
I have concidered just onr product so that the solution would be simple
Code:
Here Tot would output the required cumsum column for the given data.
Output:
Here you can specify the Time span using Number of days in Limit variable.
Hope this solves the problem you are looking for.