用先前的单元格代替列中的值

发布于 2025-02-07 23:18:20 字数 2683 浏览 1 评论 0原文

我想将数据框架中的列替换为NAN,然后将NAN替换为上一行的值。这是我目前正在使用的代码:

import pandas as pd

file = ("C:/Users/eestrada/Desktop/Copy of Routing State Reason.xls")
finalfile = ("C:/Users/eestrada/Desktop/Copy of Routing State Reason.csv")
df = pd.read_excel(file)
date = df.iat[1,1]
date = date
date = date[0:9]
#abb = ['Jan *','Feb *','Mar *','Apr *','May *','Jun *','Jul *','Aug *','Sep *','Oct *','Nov *','Dec *']

df = pd.read_csv(finalfile)
df= df.drop([0,1,2,3,4,5,6,7])
df.columns = ['Name','UserID','Reason','PresenceState','Total_Time']

这就是我的数据的样子:

                  Name UserID                     Reason PresenceState Total_Time
8    Aguilar, Bertrand   2653                        NaN           NaN   00:41:10
9               Jun 15    NaN                        NaN           NaN   00:41:10
10                 NaN    NaN        Unavailable - Break          Away   00:09:03
11                 NaN    NaN        Unavailable - Lunch          Away   00:32:07
12                 NaN    NaN            Work - Outbound          Busy   00:00:00
13      Aguilar, Sarah   2546                        NaN           NaN   00:51:38
14              Jun 15    NaN                        NaN           NaN   00:51:38
15                 NaN    NaN        Unavailable - Break          Away   00:21:38
16                 NaN    NaN        Unavailable - Lunch          Away   00:30:00
17        Alcala, Jose   2584                        NaN           NaN   02:28:09
18              Jun 15    NaN                        NaN           NaN   02:28:09
19                 NaN    NaN  Unavailable - Service Out          Away   00:00:15
20                 NaN    NaN           Work - IT Issues          Busy   02:26:29
21                 NaN    NaN            Work - Outbound          Busy   00:01:25
22     Alonso, Eduardo   2648                        NaN           NaN   00:55:32
23              Jun 15    NaN                        NaN           NaN   00:55:32
24                 NaN    NaN        Unavailable - Break          Away   00:23:20
25                 NaN    NaN        Unavailable - Lunch          Away   00:32:10
26                 NaN    NaN        Work - 1on1Sessions          Busy   00:00:02
27  Andrade, Guillermo   2526                        NaN           NaN   01:00:49

现在,这是所需的输出:

”在此处输入图像描述

希望大家都可以帮助我:C

I want to replace values from a column in a dataframe to NaN and then replace Nan to the value from the previous row. This is the code that I'm currently using:

import pandas as pd

file = ("C:/Users/eestrada/Desktop/Copy of Routing State Reason.xls")
finalfile = ("C:/Users/eestrada/Desktop/Copy of Routing State Reason.csv")
df = pd.read_excel(file)
date = df.iat[1,1]
date = date
date = date[0:9]
#abb = ['Jan *','Feb *','Mar *','Apr *','May *','Jun *','Jul *','Aug *','Sep *','Oct *','Nov *','Dec *']

df = pd.read_csv(finalfile)
df= df.drop([0,1,2,3,4,5,6,7])
df.columns = ['Name','UserID','Reason','PresenceState','Total_Time']

And this is how my data looks like:

                  Name UserID                     Reason PresenceState Total_Time
8    Aguilar, Bertrand   2653                        NaN           NaN   00:41:10
9               Jun 15    NaN                        NaN           NaN   00:41:10
10                 NaN    NaN        Unavailable - Break          Away   00:09:03
11                 NaN    NaN        Unavailable - Lunch          Away   00:32:07
12                 NaN    NaN            Work - Outbound          Busy   00:00:00
13      Aguilar, Sarah   2546                        NaN           NaN   00:51:38
14              Jun 15    NaN                        NaN           NaN   00:51:38
15                 NaN    NaN        Unavailable - Break          Away   00:21:38
16                 NaN    NaN        Unavailable - Lunch          Away   00:30:00
17        Alcala, Jose   2584                        NaN           NaN   02:28:09
18              Jun 15    NaN                        NaN           NaN   02:28:09
19                 NaN    NaN  Unavailable - Service Out          Away   00:00:15
20                 NaN    NaN           Work - IT Issues          Busy   02:26:29
21                 NaN    NaN            Work - Outbound          Busy   00:01:25
22     Alonso, Eduardo   2648                        NaN           NaN   00:55:32
23              Jun 15    NaN                        NaN           NaN   00:55:32
24                 NaN    NaN        Unavailable - Break          Away   00:23:20
25                 NaN    NaN        Unavailable - Lunch          Away   00:32:10
26                 NaN    NaN        Work - 1on1Sessions          Busy   00:00:02
27  Andrade, Guillermo   2526                        NaN           NaN   01:00:49

Now this is the desired output:

enter image description here

Hope you all can help me :c

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

氛圍 2025-02-14 23:18:20

一种方法:

df["UserID"] = df["UserID"].ffill()
df["Name"] = df.groupby("UserID")["Name"].transform("first")

iiuc,使用ffillgroupby.transform:输出的

                  Name  UserID                     Reason PresenceState  \
8    Aguilar, Bertrand  2653.0                        NaN           NaN   
9    Aguilar, Bertrand  2653.0                        NaN           NaN   
10   Aguilar, Bertrand  2653.0        Unavailable - Break          Away   
11   Aguilar, Bertrand  2653.0        Unavailable - Lunch          Away   
12   Aguilar, Bertrand  2653.0            Work - Outbound          Busy   
13      Aguilar, Sarah  2546.0                        NaN           NaN   
14      Aguilar, Sarah  2546.0                        NaN           NaN   
15      Aguilar, Sarah  2546.0        Unavailable - Break          Away   
16      Aguilar, Sarah  2546.0        Unavailable - Lunch          Away   
17        Alcala, Jose  2584.0                        NaN           NaN   
18        Alcala, Jose  2584.0                        NaN           NaN   
19        Alcala, Jose  2584.0  Unavailable - Service Out          Away   
20        Alcala, Jose  2584.0           Work - IT Issues          Busy   
21        Alcala, Jose  2584.0            Work - Outbound          Busy   
22     Alonso, Eduardo  2648.0                        NaN           NaN   
23     Alonso, Eduardo  2648.0                        NaN           NaN   
24     Alonso, Eduardo  2648.0        Unavailable - Break          Away   
25     Alonso, Eduardo  2648.0        Unavailable - Lunch          Away   
26     Alonso, Eduardo  2648.0        Work - 1on1Sessions          Busy   
27  Andrade, Guillermo  2526.0                        NaN           NaN   

   Total_Time  
8    00:41:10  
9    00:41:10  
10   00:09:03  
11   00:32:07  
12   00:00:00  
13   00:51:38  
14   00:51:38  
15   00:21:38  
16   00:30:00  
17   02:28:09  
18   02:28:09  
19   00:00:15  
20   02:26:29  
21   00:01:25  
22   00:55:32  
23   00:55:32  
24   00:23:20  
25   00:32:10  
26   00:00:02  
27   01:00:49  

IIUC, one way using ffill with groupby.transform:

df["UserID"] = df["UserID"].ffill()
df["Name"] = df.groupby("UserID")["Name"].transform("first")

Output:

                  Name  UserID                     Reason PresenceState  \
8    Aguilar, Bertrand  2653.0                        NaN           NaN   
9    Aguilar, Bertrand  2653.0                        NaN           NaN   
10   Aguilar, Bertrand  2653.0        Unavailable - Break          Away   
11   Aguilar, Bertrand  2653.0        Unavailable - Lunch          Away   
12   Aguilar, Bertrand  2653.0            Work - Outbound          Busy   
13      Aguilar, Sarah  2546.0                        NaN           NaN   
14      Aguilar, Sarah  2546.0                        NaN           NaN   
15      Aguilar, Sarah  2546.0        Unavailable - Break          Away   
16      Aguilar, Sarah  2546.0        Unavailable - Lunch          Away   
17        Alcala, Jose  2584.0                        NaN           NaN   
18        Alcala, Jose  2584.0                        NaN           NaN   
19        Alcala, Jose  2584.0  Unavailable - Service Out          Away   
20        Alcala, Jose  2584.0           Work - IT Issues          Busy   
21        Alcala, Jose  2584.0            Work - Outbound          Busy   
22     Alonso, Eduardo  2648.0                        NaN           NaN   
23     Alonso, Eduardo  2648.0                        NaN           NaN   
24     Alonso, Eduardo  2648.0        Unavailable - Break          Away   
25     Alonso, Eduardo  2648.0        Unavailable - Lunch          Away   
26     Alonso, Eduardo  2648.0        Work - 1on1Sessions          Busy   
27  Andrade, Guillermo  2526.0                        NaN           NaN   

   Total_Time  
8    00:41:10  
9    00:41:10  
10   00:09:03  
11   00:32:07  
12   00:00:00  
13   00:51:38  
14   00:51:38  
15   00:21:38  
16   00:30:00  
17   02:28:09  
18   02:28:09  
19   00:00:15  
20   02:26:29  
21   00:01:25  
22   00:55:32  
23   00:55:32  
24   00:23:20  
25   00:32:10  
26   00:00:02  
27   01:00:49  
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文