Categorise datetime values into a new column based on date ranges

Posted on 2025-01-24 12:46:43

I have a dataset called df that looks like this:

provider  fid       pid       datetime
CHE-223   2bfc9a62  2f43d557  2021-09-26T23:18:00
CHE-223   fff669e9  295b82e2  2021-08-13T09:10:00

I want to create a new column called wave that holds a categorical label based on the date range that datetime falls into. For example, dates from 16 November 2019 to 28 February 2020 should get the value "before covid", and so on for the other ranges.

I wrote a function and applied it to every row to achieve this. This is the code I used:

def wave(row):
    if (row["datetime"] <= pd.Timestamp("2019-11-16")) & (row["datetime"] >= pd.Timestamp("2020-02-28")):
        wave="before covid"
    elif (row["datetime"] <= pd.Timestamp("2020-03-01")) & (row["datetime"] >= pd.Timestamp("2020-06-15")):
        wave="1st wave"
    elif (row["datetime"] <= pd.Timestamp("2020-06-16"))  & (row["datetime"] >= pd.Timestamp("2020-09-30")):
        wave="between waves"
    elif (row["datetime"] <= pd.Timestamp("2020-10-01")) & (row["datetime"] >= pd.Timestamp("2021-01-15")):
        wave="2nd wave"

df["wave"]=df.apply(lambda row:wave(row),axis=1)

But it gives me a column named wave but with no values. How do I fix this and categorise the date?

Answer from 勿忘初心 (2025-01-31 12:46:43):

Your function needs to return something; without a return statement it returns None for every row. Also, your date comparisons are inverted:

(row["datetime"] <= pd.Timestamp("2019-11-16")) & (row["datetime"] >= pd.Timestamp("2020-02-28"))

would only match dates that are before 16 November 2019 and at the same time after 28 February 2020, which of course never happens.
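
A quick check makes both problems visible. This is only a sketch: the timestamp is made up (it is not from the question's data) and `no_return` is a hypothetical stand-in for the original function. It shows that the inverted condition is false even for a date inside the intended range, and that a function without a return statement yields None.

```python
import pandas as pd

# Hypothetical date that lies inside the intended "before covid" range
ts = pd.Timestamp("2020-01-10")

# Inverted bounds, as written in the question: both cannot be true at once
inverted = (ts <= pd.Timestamp("2019-11-16")) & (ts >= pd.Timestamp("2020-02-28"))

# Corrected bounds: lower bound first, then upper bound
corrected = (ts >= pd.Timestamp("2019-11-16")) and (ts <= pd.Timestamp("2020-02-28"))

def no_return(row):
    # Assigns a local variable but never returns it
    label = "before covid"

result = no_return({"datetime": ts})

print(inverted, corrected, result)
```

So even with the comparisons fixed, the column would stay empty until the function returns its label.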

Your function should look like:

def wave(row):
    wave = ""
    if (row["datetime"] >= pd.Timestamp("2019-11-16")) and (row["datetime"] <= pd.Timestamp("2020-02-28")):
        wave="before covid"
    elif (row["datetime"] >= pd.Timestamp("2020-03-01")) and (row["datetime"] <= pd.Timestamp("2020-06-15")):
        wave="1st wave"
    elif (row["datetime"] >= pd.Timestamp("2020-06-16"))  and (row["datetime"] <= pd.Timestamp("2020-09-30")):
        wave="between waves"
    elif (row["datetime"] >= pd.Timestamp("2020-10-01")) and (row["datetime"] <= pd.Timestamp("2021-01-15")):
        wave="2nd wave"
    elif (row["datetime"] >= pd.Timestamp("2021-01-16")):
        wave="after second wave"
    return wave
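
To try this out, something like the following could be used. It is a sketch under a few assumptions: the sample rows are invented, the `WAVES` table and the loop are a compact restatement of the function above rather than the original code, and the datetime column is assumed to need parsing with `pd.to_datetime` first.

```python
import pandas as pd

# (label, first day, last day): the same ranges as in the function above
WAVES = [
    ("before covid", "2019-11-16", "2020-02-28"),
    ("1st wave", "2020-03-01", "2020-06-15"),
    ("between waves", "2020-06-16", "2020-09-30"),
    ("2nd wave", "2020-10-01", "2021-01-15"),
]

def wave(row):
    ts = row["datetime"]
    for label, start, end in WAVES:
        if pd.Timestamp(start) <= ts <= pd.Timestamp(end):
            return label
    if ts >= pd.Timestamp("2021-01-16"):
        return "after second wave"
    return ""  # date falls outside every range

# Invented sample rows; only the datetime column matters here
df = pd.DataFrame(
    {"datetime": ["2020-01-10T08:00:00", "2020-04-02T12:30:00", "2021-09-26T23:18:00"]}
)
df["datetime"] = pd.to_datetime(df["datetime"])  # strings -> Timestamps
df["wave"] = df.apply(wave, axis=1)

print(df)
```

Note that an upper bound such as pd.Timestamp("2020-02-28") means midnight at the start of that day, so a time later on the 28th (or any time on 29 February 2020) matches no range and gets the empty label.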

Edit: also, & is a bitwise operator. For logical expressions on single values, as in this row-wise function, use and.
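
Where & does fit is when comparing a whole column at once, because it combines boolean Series element by element. As an alternative to apply, a vectorized version might look like the sketch below. It swaps in `numpy.select`, which is not what the question used, and the sample rows are invented.

```python
import numpy as np
import pandas as pd

# Invented sample rows, already parsed as datetimes
df = pd.DataFrame(
    {"datetime": pd.to_datetime(["2020-01-10", "2020-07-04", "2020-11-20"])}
)
dt = df["datetime"]
T = pd.Timestamp  # short alias to keep the conditions readable

# Each condition is a boolean Series; & combines them element by element
conditions = [
    (dt >= T("2019-11-16")) & (dt <= T("2020-02-28")),
    (dt >= T("2020-03-01")) & (dt <= T("2020-06-15")),
    (dt >= T("2020-06-16")) & (dt <= T("2020-09-30")),
    (dt >= T("2020-10-01")) & (dt <= T("2021-01-15")),
    dt >= T("2021-01-16"),
]
labels = ["before covid", "1st wave", "between waves", "2nd wave", "after second wave"]

# The first matching condition wins; rows matching none get the default
df["wave"] = np.select(conditions, labels, default="")

print(df)
```

This avoids calling a Python function once per row, which usually matters on larger frames.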
