如何根据 Pandas 中的年份和周数定义 5-4-4 周的期间列

发布于 2025-01-16 23:38:29 字数 2297 浏览 0 评论 0原文

我的公司使用 5-4-4 日历进行报告。每个月(又称周期)长 4 周,但每个第一个月长 5 周。

Pandas 似乎对 自定义日历周期并发现此文档也是如此。

我正在使用下面的代码从周和年创建期间列

dff['Period'] = pd.to_datetime(dff["Week ID"].astype("str") + "-0", format="%Y%U-%w").dt.month
dff['Period'] ='Period ' + dff['Period'].astype(str).str.zfill(2)

,但是我的公司对期间使用 5-4-4 格式,即

  • 前 5 周 [Wk01 到 Wk05] 应被视为期间 1,
  • 接下来的 4 周 [Wk06 到 Wk09] 应被视为期间 1被视为周期 2
  • 被视为周期 3

接下来 4 周 [第 10 周到第 13 周] 应该再次

  • 接下来 5 周 [第 14 周到第 18 周] 应该被视为周期 接下来的 4周
  • [第 19 周到第 22 周] 应被视为第 5 期
  • 接下来的 4 周 [第 23 周到第 26 周] 应被视为第 6 期

,每年依此类推......

当前产出

Year    Period      Week    Week ID Site
2020    Period 01   Wk01    202001  ABC
2020    Period 01   Wk02    202002  ABC
2020    Period 01   Wk03    202003  ABC
2020    Period 01   Wk04    202004  ABC
2020    Period 02   Wk05    202005  ABC
2020    Period 02   Wk06    202006  ABC
2020    Period 02   Wk07    202007  ABC
2020    Period 02   Wk08    202008  ABC
2020    Period 03   Wk09    202009  ABC
2020    Period 03   Wk10    202010  ABC
2020    Period 03   Wk11    202011  ABC
2020    Period 03   Wk12    202012  ABC
2020    Period 03   Wk13    202013  ABC
2020    Period 04   Wk14    202014  ABC
2020    Period 04   Wk15    202015  ABC

预期产出

Year    Period      Week    Week ID Site
2020    Period 01   Wk01    202001  ABC
2020    Period 01   Wk02    202002  ABC
2020    Period 01   Wk03    202003  ABC
2020    Period 01   Wk04    202004  ABC
2020    Period 01   Wk05    202005  ABC
2020    Period 02   Wk06    202006  ABC
2020    Period 02   Wk07    202007  ABC
2020    Period 02   Wk08    202008  ABC
2020    Period 02   Wk09    202009  ABC
2020    Period 03   Wk10    202010  ABC
2020    Period 03   Wk11    202011  ABC
2020    Period 03   Wk12    202012  ABC
2020    Period 03   Wk13    202013  ABC
2020    Period 04   Wk14    202014  ABC
2020    Period 04   Wk15    202015  ABC

My company uses a 5-4-4 calendar for reporting purposes. Each month (aka period) is 4-weeks long, except every first month is 5-weeks long.

Pandas seems to have good support for custom calendar periods and found this documentation too.

I'm creating Period column from Week and Year using below code

dff['Period'] = pd.to_datetime(dff["Week ID"].astype("str") + "-0", format="%Y%U-%w").dt.month
dff['Period'] ='Period ' + dff['Period'].astype(str).str.zfill(2)

However my company is using 5-4-4 format for period i.e.

  • First 5 weeks [Wk01 to Wk05] should be considered as Period 1
  • next 4 weeks [Wk06 to Wk09] should be considered as Period 2
  • next 4 weeks [Wk10 to Wk13] should be considered as Period 3

again

  • next 5 weeks [Wk14 to Wk18] should be considered as Period 4
  • next 4 weeks [Wk19 to Wk22] should be considered as Period 5
  • next 4 weeks [Wk23 to Wk26] should be considered as Period 6

and so on for every year.......

Current output

Year    Period      Week    Week ID Site
2020    Period 01   Wk01    202001  ABC
2020    Period 01   Wk02    202002  ABC
2020    Period 01   Wk03    202003  ABC
2020    Period 01   Wk04    202004  ABC
2020    Period 02   Wk05    202005  ABC
2020    Period 02   Wk06    202006  ABC
2020    Period 02   Wk07    202007  ABC
2020    Period 02   Wk08    202008  ABC
2020    Period 03   Wk09    202009  ABC
2020    Period 03   Wk10    202010  ABC
2020    Period 03   Wk11    202011  ABC
2020    Period 03   Wk12    202012  ABC
2020    Period 03   Wk13    202013  ABC
2020    Period 04   Wk14    202014  ABC
2020    Period 04   Wk15    202015  ABC

Expected output

Year    Period      Week    Week ID Site
2020    Period 01   Wk01    202001  ABC
2020    Period 01   Wk02    202002  ABC
2020    Period 01   Wk03    202003  ABC
2020    Period 01   Wk04    202004  ABC
2020    Period 01   Wk05    202005  ABC
2020    Period 02   Wk06    202006  ABC
2020    Period 02   Wk07    202007  ABC
2020    Period 02   Wk08    202008  ABC
2020    Period 02   Wk09    202009  ABC
2020    Period 03   Wk10    202010  ABC
2020    Period 03   Wk11    202011  ABC
2020    Period 03   Wk12    202012  ABC
2020    Period 03   Wk13    202013  ABC
2020    Period 04   Wk14    202014  ABC
2020    Period 04   Wk15    202015  ABC

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

红墙和绿瓦 2025-01-23 23:38:29

您可以使用一些数学来计算映射器:

# define week pattern
pat = np.array([5,4,4,4])
# weeks
weeks = np.arange(52)+1
# get divider/remainder of full period (=divmod 17 here)
d,r = np.divmod((weeks-1), pat.sum())
# determine periods
a = (r==pat.cumsum()[:,None]).cumsum(axis=1).sum(axis=0)+d+1

# define mapper
week_to_period = dict(enumerate(a, start=1))
# {1: 1, 2: 1, 3: 1, 4: 1, 5: 1, 6: 2, 7: 2, 8: 2, 9: 2, 10: 3, 11: 3,...}

# map week numbers to custom periods
s = (pd.to_datetime(dff["Week ID"].astype("str") + "-0", format="%Y%U-%w")
       .dt.isocalendar().week()
     )
dff['Period'] = 'Period '+s.map(week_to_period).astype(str).str.zfill(2)

输出:

    Year     Period  Week  Week ID Site
0   2020  Period 01  Wk01   202001  ABC
1   2020  Period 01  Wk02   202002  ABC
2   2020  Period 01  Wk03   202003  ABC
3   2020  Period 01  Wk04   202004  ABC
4   2020  Period 01  Wk05   202005  ABC
5   2020  Period 02  Wk06   202006  ABC
6   2020  Period 02  Wk07   202007  ABC
7   2020  Period 02  Wk08   202008  ABC
8   2020  Period 02  Wk09   202009  ABC
9   2020  Period 03  Wk10   202010  ABC
10  2020  Period 03  Wk11   202011  ABC
11  2020  Period 03  Wk12   202012  ABC
12  2020  Period 03  Wk13   202013  ABC
13  2020  Period 04  Wk14   202014  ABC
14  2020  Period 04  Wk15   202015  ABC

You can use some math to compute a mapper:

# define week pattern
pat = np.array([5,4,4,4])
# weeks
weeks = np.arange(52)+1
# get divider/remainder of full period (=divmod 17 here)
d,r = np.divmod((weeks-1), pat.sum())
# determine periods
a = (r==pat.cumsum()[:,None]).cumsum(axis=1).sum(axis=0)+d+1

# define mapper
week_to_period = dict(enumerate(a, start=1))
# {1: 1, 2: 1, 3: 1, 4: 1, 5: 1, 6: 2, 7: 2, 8: 2, 9: 2, 10: 3, 11: 3,...}

# map week numbers to custom periods
s = (pd.to_datetime(dff["Week ID"].astype("str") + "-0", format="%Y%U-%w")
       .dt.isocalendar().week()
     )
dff['Period'] = 'Period '+s.map(week_to_period).astype(str).str.zfill(2)

output:

    Year     Period  Week  Week ID Site
0   2020  Period 01  Wk01   202001  ABC
1   2020  Period 01  Wk02   202002  ABC
2   2020  Period 01  Wk03   202003  ABC
3   2020  Period 01  Wk04   202004  ABC
4   2020  Period 01  Wk05   202005  ABC
5   2020  Period 02  Wk06   202006  ABC
6   2020  Period 02  Wk07   202007  ABC
7   2020  Period 02  Wk08   202008  ABC
8   2020  Period 02  Wk09   202009  ABC
9   2020  Period 03  Wk10   202010  ABC
10  2020  Period 03  Wk11   202011  ABC
11  2020  Period 03  Wk12   202012  ABC
12  2020  Period 03  Wk13   202013  ABC
13  2020  Period 04  Wk14   202014  ABC
14  2020  Period 04  Wk15   202015  ABC
梦里梦着梦中梦 2025-01-23 23:38:29
import numpy as np
import pandas as pd

# Define range of months
x = list(range(1,13))

# Define week pattern
y = [5, 4, 4]*4

period1 = list(np.repeat(x, y))
wk = ['Wk'+str(i) for i in range(1,53)]

di_wk = dict(zip(wk, period1))
'''{'Wk1': 1, 'Wk2': 1, 'Wk3': 1, 'Wk4': 1, 'Wk5': 1, 'Wk6': 2, 'Wk7': 2, 'Wk8': 2,
 'Wk9': 2, 'Wk10': 3, 'Wk11': 3, 'Wk12': 3, 'Wk13': 3, 'Wk14': 4, 'Wk15': 4,......}'''

dff['Period'] = 'Period ' + dff['Week'].map(di_wk).astype(str).str.zfill(2)
import numpy as np
import pandas as pd

# Define range of months
x = list(range(1,13))

# Define week pattern
y = [5, 4, 4]*4

period1 = list(np.repeat(x, y))
wk = ['Wk'+str(i) for i in range(1,53)]

di_wk = dict(zip(wk, period1))
'''{'Wk1': 1, 'Wk2': 1, 'Wk3': 1, 'Wk4': 1, 'Wk5': 1, 'Wk6': 2, 'Wk7': 2, 'Wk8': 2,
 'Wk9': 2, 'Wk10': 3, 'Wk11': 3, 'Wk12': 3, 'Wk13': 3, 'Wk14': 4, 'Wk15': 4,......}'''

dff['Period'] = 'Period ' + dff['Week'].map(di_wk).astype(str).str.zfill(2)
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文