Grouping a date column into n-day periods

Posted on 2025-01-10 05:02:57

I need a function that groups a date column into n-day periods relative to given start and end dates (a 1-year interval). To assign a quarter (a ~90-day period) to every date in the data frame, I used the code below, which is not very neat (and I want to reuse it for 30-day periods as well):

from datetime import timedelta

def get_quarter(row, start_date, col_name):
    # col_name is the name of the date column, e.g. 'TRN_DT'
    date = row[col_name]
    if start_date <= date <= start_date + timedelta(days=90):
        return 0
    if start_date + timedelta(days=90) < date <= start_date + timedelta(days=180):
        return 1
    if start_date + timedelta(days=180) < date <= start_date + timedelta(days=270):
        return 2
    # everything else, including dates before start_date
    return 3

It basically checks, row by row, which interval the current date belongs to. I was wondering whether there is a better way to do this. pandas.Series.dt.to_period() will not do, since it uses the calendar year as its reference (start 01-Jan, end 31-Dec); that is, 16-Jan will always be in Q1. What I want is for 16-Jan to be in Q3 if the start date is 16-Jun. Thanks.
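
To make the limitation concrete, here is a small illustration of what calendar-based quarters return. The sample dates are made up for the example; only `to_period("Q")` and `.dt.quarter` come from pandas itself.

```python
import pandas as pd

# Hypothetical sample dates: 16-Jan and 16-Jun.
dates = pd.Series(pd.to_datetime(["2022-01-16", "2021-06-16"]))

# Calendar quarters always count from 1 January.
calendar_quarters = dates.dt.to_period("Q")
quarter_numbers = calendar_quarters.dt.quarter

print(calendar_quarters.astype(str).tolist())  # ['2022Q1', '2021Q2']
print(quarter_numbers.tolist())                # [1, 2]
```

So 16-Jan lands in Q1 regardless of any custom start date, which is the behaviour described above.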

Comments (1)

弃爱 2025-01-17 05:02:57

FTR, a possible solution is to shift every date in the series by an offset derived from start_date, so that start_date behaves as if it were the beginning of the year:

>>> import pandas as pd
>>> start_date = pd.to_datetime("2021-06-16")
>>> dates_series = pd.Series([pd.to_datetime("2020-01-16"), pd.to_datetime("2020-04-16")], name="dates")
>>> dates_series.dt.quarter  # plain calendar quarters, before shifting
0    1
1    2
Name: dates, dtype: int64

We calculate the difference between the start date and the beginning of the same year.

>>> offset = start_date - start_date.replace(month=1, day=1)
>>> offset
Timedelta('166 days 00:00:00')

We move all of our dates back by the same offset to calculate the "new quarter":

>>> (dates_series - offset).dt.quarter
0    3
1    4
Name: dates, dtype: int64
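
The shift trick only covers quarters. For the general n-day case the question asks about (90-day and 30-day periods), one option is to floor-divide the number of days elapsed since the start date. This is only a sketch under a few assumptions: `assign_period` and `max_period` are hypothetical names, periods are 0-based like in the question's `get_quarter`, and each period is half-open (days 0 to n-1 go to period 0), which differs slightly from the inclusive bounds in the original function. Dates before the start date would get negative period numbers.

```python
import pandas as pd


def assign_period(dates: pd.Series, start_date, n_days: int, max_period=None) -> pd.Series:
    """Return the 0-based index of the n_days-long period each date falls in."""
    # Whole days elapsed since start_date (negative for earlier dates).
    elapsed = (dates - pd.Timestamp(start_date)).dt.days
    # Floor division maps days 0..n_days-1 to period 0, the next n_days to period 1, ...
    periods = elapsed // n_days
    if max_period is not None:
        # Fold the leftover days at the end of the year into the last period.
        periods = periods.clip(upper=max_period)
    return periods


# Made-up sample dates within one year of the start date.
dates = pd.Series(
    pd.to_datetime(["2021-06-16", "2021-09-13", "2021-09-14", "2022-01-16", "2022-06-15"])
)

quarters = assign_period(dates, "2021-06-16", 90, max_period=3)
months = assign_period(dates, "2021-06-16", 30)

print(quarters.tolist())  # [0, 0, 1, 2, 3]
print(months.tolist())    # [0, 2, 3, 7, 12]
```

With a start date of 16-Jun, 16-Jan falls in period 2, i.e. the third quarter, and the whole column is handled in one vectorized step instead of row by row.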