How do I compare dates between two columns within a number of days and perform a task?

Posted on 2025-01-19 14:15:08

Every time there is an UNKNOWN in df, I would like to take that row's delivery date and check it against the oldest delivery date (grouped by car_part) in df2 to see whether it falls within ±90 days. If it does, print the date; otherwise, go to the next UNKNOWN.

import pandas as pd

data = {'car_part': ['100009','100093','100071','100033','100033','100043'],
        'car_number': ['UNKNOWN', 'X123-00027C', 'X123-00027C', 'UNKNOWN', 'X123-00148C', 'X123-00148C'],
        'delivery': ['11/20/2004', '12/17/2009', '7/27/2010', '11/1/2004', '9/5/2004', '11/10/2004'],
        'test': ['12/17/2009', '7/27/2010', '7/10/2020', '12/22/2006', '3/26/2007', '12/1/2007']}

data2 = {'delivery': ['11/1/2004', '12/1/2004', '1/1/2005', '7/1/2006', '8/1/2006', '9/2/2006'],
         'car_part': ['100009','100009','100009','100033','100033','100033']}

df = pd.DataFrame(data)
print(df)
df2 = pd.DataFrame(data2)
print(df2)

# Convert the delivery strings to datetimes.
df['delivery'] = df['delivery'].astype('datetime64[ns]')
# Note: sort_values returns a new DataFrame; the result is not assigned here,
# so df keeps its original row order.
df.sort_values(by = ['car_part', 'delivery', 'test'], ascending=[True, True, True])

df2['delivery'] = df2['delivery'].astype('datetime64[ns]')
df2.sort_values(by = ['car_part', 'delivery'], ascending=[True, True])

I've tried this:

df["delivery"] = pd.to_datetime(df["delivery"])
df2["delivery"] = pd.to_datetime(df2["delivery"])
for index, row in df.iterrows():
    if row['car_number'] == "UNKNOWN":
        oldest_date = df["car_part"].map(df2.groupby("car_part")["delivery"].min())
        diff = (row['delivery']-oldest_date).days
        if diff<91:
            print(row['delivery']) 

but I get this error: AttributeError: 'Series' object has no attribute 'days'

Comments (2)

静谧幽蓝 2025-01-26 14:15:08

Change your code to the following. I don't fully understand the final output or what you are asking, but your map line is wrong: it returns a whole Series rather than a single date. Since you want to keep the same code structure, that line should look something like this:

df["delivery"] = pd.to_datetime(df["delivery"])
df2["delivery"] = pd.to_datetime(df2["delivery"])
for index, row in df.iterrows():
    if row['car_number'] == "UNKNOWN":
        # Earliest df2 delivery for this row's car_part (filter df2 by its own column)
        oldest_date = df2.loc[df2["car_part"] == row["car_part"], "delivery"].min()
        diff = (row['delivery'] - oldest_date).days
        # abs() so the check covers both sides of the +-90 day window
        if abs(diff) <= 90:
            print(row['delivery'])
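
For context on the original error, here is a minimal sketch on a cut-down version of the question's data. It assumes the date strings are month/day/year, and the name `oldest_by_part` is only illustrative. `Series.map` returns one value per row of `df`, so subtracting it from a single `Timestamp` gives a Series of timedeltas, which exposes the day count through `.dt.days` rather than `.days`. A scalar lookup by `car_part` gives a single `Timedelta` instead.

```python
import pandas as pd

# Cut-down copy of the question's data (assumption: dates are month/day/year).
df = pd.DataFrame({
    "car_part": ["100009", "100033"],
    "car_number": ["UNKNOWN", "UNKNOWN"],
    "delivery": pd.to_datetime(["11/20/2004", "11/1/2004"], format="%m/%d/%Y"),
})
df2 = pd.DataFrame({
    "delivery": pd.to_datetime(
        ["11/1/2004", "12/1/2004", "7/1/2006", "8/1/2006"], format="%m/%d/%Y"
    ),
    "car_part": ["100009", "100009", "100033", "100033"],
})

# One earliest delivery date per car_part, indexed by car_part.
oldest_by_part = df2.groupby("car_part")["delivery"].min()

row = df.iloc[0]

# map() yields a whole Series, so this subtraction is a Series of timedeltas.
per_row = row["delivery"] - df["car_part"].map(oldest_by_part)
print(type(per_row).__name__)   # Series
print(per_row.dt.days.iloc[0])  # day count via the .dt accessor

# A scalar lookup yields a single Timedelta, which does have .days.
scalar_diff = row["delivery"] - oldest_by_part[row["car_part"]]
print(scalar_diff.days)         # 19
```

Either form works; the loop version needs the scalar lookup, while the Series form suits a vectorized solution.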
幸福还没到 2025-01-26 14:15:08

Try:

  1. Use groupby and min to get the earliest delivery date for each car part.
  2. Subtract that earliest date from the delivery in df and take the absolute difference in days.
  3. Keep the oldest date value only when the car number is UNKNOWN and the delivery is within 90 days of the oldest date.

oldest = df["car_part"].map(df2.groupby("car_part")["delivery"].min())

df["oldest"] = oldest.where(
    df["car_number"].eq("UNKNOWN")
    & df["delivery"].sub(oldest).abs().dt.days.le(90)
)

>>> df
  car_part   car_number   delivery        test     oldest
0   100009      UNKNOWN 2004-11-20  12/17/2009 2004-11-01
1   100093  X123-00027C 2009-12-17   7/27/2010        NaT
2   100071  X123-00027C 2010-07-27   7/10/2020        NaT
3   100033      UNKNOWN 2004-11-01  12/22/2006        NaT
4   100033  X123-00148C 2004-09-05   3/26/2007        NaT
5   100043  X123-00148C 2004-11-10   12/1/2007        NaT
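
If the goal is to print the matching delivery dates, as the question asks, the same condition can select those rows directly. This is a minimal sketch on the question's data, assuming the date strings are month/day/year; the names `within_90` and `matches` are illustrative and not part of the answer above.

```python
import pandas as pd

df = pd.DataFrame({
    "car_part": ["100009", "100093", "100071", "100033", "100033", "100043"],
    "car_number": ["UNKNOWN", "X123-00027C", "X123-00027C",
                   "UNKNOWN", "X123-00148C", "X123-00148C"],
    "delivery": ["11/20/2004", "12/17/2009", "7/27/2010",
                 "11/1/2004", "9/5/2004", "11/10/2004"],
})
df2 = pd.DataFrame({
    "delivery": ["11/1/2004", "12/1/2004", "1/1/2005",
                 "7/1/2006", "8/1/2006", "9/2/2006"],
    "car_part": ["100009", "100009", "100009", "100033", "100033", "100033"],
})

# Assumption: the strings are month/day/year.
df["delivery"] = pd.to_datetime(df["delivery"], format="%m/%d/%Y")
df2["delivery"] = pd.to_datetime(df2["delivery"], format="%m/%d/%Y")

# Earliest df2 delivery per car_part, aligned to the rows of df
# (NaT where the car_part does not appear in df2).
oldest = df["car_part"].map(df2.groupby("car_part")["delivery"].min())

# True where the delivery is within 90 days of that date, in either direction.
# Comparisons against NaT evaluate to False, so unmatched parts drop out.
within_90 = (df["delivery"] - oldest).abs() <= pd.Timedelta(days=90)

matches = df.loc[df["car_number"].eq("UNKNOWN") & within_90, "delivery"]
for date in matches:
    print(date.date())  # 2004-11-20
```

Only car part 100009 qualifies here: its UNKNOWN delivery is 19 days after the earliest date in df2, while the one for 100033 is well over a year away from it.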