Backfill and forward fill NaNs and zeros

Posted on 2025-02-01 15:54:26, 683 words, 2 views, 0 comments

I am trying to back/forward fill the work experience (years) of employees. What I am trying to achieve is:

Employee 200

2019 - 3 yrs, 2018 - 2 yrs, 2017 - 1 yr

Employee 300

Keep as NaN

Employee 400

2018 - 3 yrs, 2017 - 2 yrs

Employee 500

2018 - 6 yrs, 2017 - 5 yrs, 2016 - 4 yrs

I am really struggling to get it to backfill (forward fill) in increments of -1 (+1). It is even trickier when the non-NaN, nonzero value is in the middle, as in the case of employee 500.

import numpy as np
import pandas as pd

df_test = pd.DataFrame({'DeptID':[0,0,0,1,1,1,2,2,2],
                        'Employee':[200, 200, 200, 300, 400, 400, 500, 500, 500],
                        'Year':[2017, 2018, 2019, 2016, 2017, 2018, 2016, 2017, 2018],
                        'Experience':[np.nan , np.nan, 3, np.nan, 2, np.nan, 0, 5, 0]
                       })

Comments (1)

っ左 2025-02-08 15:54:26

Assuming there is a single nonzero, non-NaN experience value for each employee, and that each employee's rows are consecutive and ordered by year, try this:

import numpy as np
import pandas as pd

df_test = pd.DataFrame({'DeptID':[0,0,0,1,1,1,2,2,2],
                        'Employee':[200, 200, 200, 300, 400, 400, 500, 500, 500],
                        'Year':[2017, 2018, 2019, 2016, 2017, 2018, 2016, 2017, 2018],
                        'Experience':[np.nan , np.nan, 3, np.nan, 2, np.nan, 0, 5, 0]
                       })

# find the last nonzero, non-NaN value for each employee
nonzero = (
    df_test[df_test.Experience.ne(0) & df_test.Experience.notna()]
    .drop_duplicates('Employee', keep='last')
    .reset_index()
    .set_index('Employee')
)
# map each employee's offset (experience minus the row index of the nonzero value)
# onto the Employee column, then add it to the row index
df_test['Experience'] = df_test.index + df_test.Employee.map(nonzero.Experience - nonzero['index'])
df_test
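
The snippet above depends on the row index increasing by exactly one per year within each employee. As a rough sketch under a different assumption (that the Year column is reliable and each employee has at most one consistent nonzero, non-NaN anchor), the same result might be derived from Year instead of the index. The names known and start_year are illustrative and do not come from the original answer.

```python
import numpy as np
import pandas as pd

df_test = pd.DataFrame({'DeptID': [0, 0, 0, 1, 1, 1, 2, 2, 2],
                        'Employee': [200, 200, 200, 300, 400, 400, 500, 500, 500],
                        'Year': [2017, 2018, 2019, 2016, 2017, 2018, 2016, 2017, 2018],
                        'Experience': [np.nan, np.nan, 3, np.nan, 2, np.nan, 0, 5, 0]})

# Treat zeros as missing so that only real observations act as anchors.
known = df_test['Experience'].where(df_test['Experience'].ne(0))

# For each anchor row, Year - Experience is the implied year in which
# experience would have been 0. "last" picks the last non-null value per
# employee and broadcasts it to every row of that employee.
start_year = (df_test['Year'] - known).groupby(df_test['Employee']).transform('last')

# Experience in any year is the distance from that implied start year.
# Employees with no anchor (here, 300) stay NaN.
df_test['Experience'] = df_test['Year'] - start_year
print(df_test)
```

Because the offset is computed from Year, this version should not care about row order or gaps in the index, although it still assumes the anchor values for an employee agree with each other.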
