Return a datetime based on values in a pandas DataFrame

Posted on 2025-01-17 22:05:14

I'm trying to create a datetime column from data that has "annual_quarters" as the column names, with a value for each data point in each quarter.

I was thinking of writing a custom function and using .apply to return a date I could use, but I can't seem to write a function that iterates over each row/column of the data using iterrows() or iteritems().

Here's the last thing I tried:

from datetime import datetime
def get_recent_orders(merged_data):
    for index, row in merged_data.iteritems():
        if row['q4_sales_2021'] > 0:
            return datetime(2021, 10, 1)
        elif row['q3_sales_2021'] > 0:
            return datetime(2021, 7, 1)
        elif row['q2_sales_2021'] > 0:
            return datetime(2021, 4, 1)
        elif row['q1_sales_2021'] > 0:
            return datetime(2021, 1, 1)
        elif row['q4_sales_2020'] > 0:
            return datetime(2020, 10, 1)
        elif row['q3_sales_2020'] > 0:
            return datetime(2020, 7, 1)
        elif row['q2_sales_2020'] > 0:
            return datetime(2020, 4, 1)
        elif row['q1_sales_2020'] > 0:
            return datetime(2020, 1, 1)
        else:
            return None
merged_data['last_order'] = merged_data.apply(get_recent_orders, axis=0)

Comments (3)

祁梦 2025-01-24 22:05:14

Use numpy.select:

from datetime import datetime

import numpy as np

conditions = [merged_data['q4_sales_2021'].gt(0), 
              merged_data['q3_sales_2021'].gt(0),
              merged_data['q2_sales_2021'].gt(0),
              merged_data['q1_sales_2021'].gt(0),
              merged_data['q4_sales_2020'].gt(0), 
              merged_data['q3_sales_2020'].gt(0),
              merged_data['q2_sales_2020'].gt(0),
              merged_data['q1_sales_2020'].gt(0)]

choices = [datetime(2021, 10, 1),
           datetime(2021, 7, 1),
           datetime(2021, 4, 1),
           datetime(2021, 1, 1),
           datetime(2020, 10, 1),
           datetime(2020, 7, 1),
           datetime(2020, 4, 1),
           datetime(2020, 1, 1)]

merged_data["last_order"] = np.select(conditions, choices, None)
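
To make the approach above easier to try out, here is a minimal, self-contained sketch. The sample values are made up, and only two of the eight quarter columns are included to keep it short. The pd.to_datetime call is an addition that is not in the answer: it is one way to get a proper datetime column instead of an object column.

```python
from datetime import datetime

import numpy as np
import pandas as pd

# Hypothetical sample data; the real frame would have all eight quarter columns.
merged_data = pd.DataFrame({
    "q4_sales_2021": [5, 0, 0],
    "q3_sales_2021": [2, 7, 0],
})

# Conditions are checked in order, so the most recent quarter goes first.
conditions = [merged_data["q4_sales_2021"].gt(0),
              merged_data["q3_sales_2021"].gt(0)]
choices = [datetime(2021, 10, 1), datetime(2021, 7, 1)]

# np.select returns an object array here (datetime objects and None).
picked = np.select(conditions, choices, None)

# Added step (assumption): convert to datetime64 so None becomes NaT.
merged_data["last_order"] = pd.to_datetime(picked)
```

Rows where no quarter has positive sales end up as NaT after the conversion.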

小鸟爱天空丶 2025-01-24 22:05:14

With np.select:

from datetime import datetime

import numpy as np
import pandas as pd

condlist = [merged_data['q4_sales_2021'] > 0,
            merged_data['q3_sales_2021'] > 0,
            merged_data['q2_sales_2021'] > 0,
            merged_data['q1_sales_2021'] > 0,
            merged_data['q4_sales_2020'] > 0,
            merged_data['q3_sales_2020'] > 0,
            merged_data['q2_sales_2020'] > 0,
            merged_data['q1_sales_2020'] > 0]

choicelist = [datetime(2021, 10, 1),
              datetime(2021, 7, 1),
              datetime(2021, 4, 1),
              datetime(2021, 1, 1),
              datetime(2020, 10, 1),
              datetime(2020, 7, 1),
              datetime(2020, 4, 1),
              datetime(2020, 1, 1)]

# pd.NaT (capital N, lowercase a, capital T) marks a missing datetime
default = pd.NaT

merged_data['last_order'] = np.select(condlist, choicelist, default)
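
Since every condition and choice follows the same pattern, the two lists could also be built from the column names instead of being typed out. This is only a sketch under some assumptions: the helper quarter_start is hypothetical, the columns are assumed to be named exactly like q4_sales_2021, and the sample data is invented.

```python
from datetime import datetime

import numpy as np
import pandas as pd

# Hypothetical sample data with two quarter columns.
merged_data = pd.DataFrame({
    "q4_sales_2020": [4, 1, 0],
    "q1_sales_2021": [0, 3, 0],
})


def quarter_start(column_name):
    """Map a name like 'q4_sales_2020' to the first day of that quarter."""
    quarter, _, year = column_name.split("_")
    month = 3 * (int(quarter[1]) - 1) + 1  # q1 -> 1, q2 -> 4, q3 -> 7, q4 -> 10
    return datetime(int(year), month, 1)


# Most recent quarter first, because np.select takes the first match.
quarter_columns = sorted(
    (c for c in merged_data.columns if "_sales_" in c),
    key=quarter_start,
    reverse=True,
)

condlist = [merged_data[c] > 0 for c in quarter_columns]
choicelist = [quarter_start(c) for c in quarter_columns]

# None is used as the default; pd.to_datetime turns it into NaT.
merged_data["last_order"] = pd.to_datetime(np.select(condlist, choicelist, None))
```

With this layout, adding another year of quarter columns would not require touching the lists by hand.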

寻找一个思念的角度 2025-01-24 22:05:14

You could do this instead:

import datetime as dt

import pandas as pd

df = pd.DataFrame(data={"annual_quarters": ["q4_sales_2021", "q3_sales_2021", "q2_sales_2021", "q1_sales_2021",
                                            "q4_sales_2020", "q3_sales_2020", "q2_sales_2020", "q1_sales_2020"]})

def get_recent_orders(quarter):
    month_list = [1, 4, 7, 10]
    month = month_list[int(str(quarter)[1])-1]
    year = int(str(quarter)[-4:])
    
    return dt.date(year, month, 1)

df["last_order"] = df["annual_quarters"].apply(get_recent_orders)

Since the month is Jan, Apr, Jul or Oct depending on whether the quarter is Q1, Q2, Q3 or Q4, you can return a datetime.date by taking the month from the second character of the string, the year from the last four characters, and setting day = 1. This way, there is no need to iterate over rows at all.
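
The same parsing idea can probably be done without .apply as well, using the pandas string accessor. A minimal sketch, assuming the strings always look like q4_sales_2021 (quarter digit in the second position, year in the last four characters); the intermediate names quarter and parts are just for illustration:

```python
import pandas as pd

df = pd.DataFrame({"annual_quarters": ["q4_sales_2021", "q1_sales_2020"]})

# Second character is the quarter number, last four characters are the year.
quarter = df["annual_quarters"].str[1].astype(int)

parts = pd.DataFrame({
    "year": df["annual_quarters"].str[-4:].astype(int),
    "month": 3 * (quarter - 1) + 1,  # q1 -> 1, q2 -> 4, q3 -> 7, q4 -> 10
    "day": 1,
})

# pd.to_datetime can assemble dates from year/month/day columns.
df["last_order"] = pd.to_datetime(parts)
```

Note that this gives a datetime64 column rather than the datetime.date objects returned by the function above.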
