Is there a way to forward fill with ascending logic in pandas / numpy?

Posted on 2025-02-10 02:51:48

What is the most pandastic way to forward fill with ascending logic (without iterating over the rows)?

input:

import pandas as pd
import numpy as np

df = pd.DataFrame()

df['test'] = np.nan,np.nan,1,np.nan,np.nan,3,np.nan,np.nan,2,np.nan,6,np.nan,np.nan
df['desired_output'] = np.nan,np.nan,1,1,1,3,3,3,3,3,6,6,6

print(df)

output:

    test  desired_output
0    NaN             NaN
1    NaN             NaN
2    1.0             1.0
3    NaN             1.0
4    NaN             1.0
5    3.0             3.0
6    NaN             3.0
7    NaN             3.0
8    2.0             3.0
9    NaN             3.0
10   6.0             6.0
11   NaN             6.0
12   NaN             6.0

In the 'test' column, the number of consecutive NaNs is arbitrary.

In the 'desired_output' column, I am trying to forward fill with ascending values only: when a lower value is encountered (row 8, value = 2.0 above), it should be overwritten with the current higher value.

Can anyone help? Thanks in advance.


Comments (1)

酒与心事 2025-02-17 02:51:48

You can combine cummax to select the cumulative maximum value and ffill to replace the NaNs:

df['desired_output'] = df['test'].cummax().ffill()

output:

    test  desired_output
0    NaN             NaN
1    NaN             NaN
2    1.0             1.0
3    NaN             1.0
4    NaN             1.0
5    3.0             3.0
6    NaN             3.0
7    NaN             3.0
8    2.0             3.0
9    NaN             3.0
10   6.0             6.0
11   NaN             6.0
12   NaN             6.0
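
For anyone who wants to check this end to end, here is a minimal self-contained sketch using the sample data from the question. The `swapped_order` column and the two boolean variable names are only illustrative and not part of the original answer; they are there to show that, on this data at least, filling first and taking the cumulative maximum second gives the same result.

```python
import numpy as np
import pandas as pd

# Sample data from the question
test = [np.nan, np.nan, 1, np.nan, np.nan, 3, np.nan, np.nan, 2, np.nan, 6, np.nan, np.nan]
expected = [np.nan, np.nan, 1, 1, 1, 3, 3, 3, 3, 3, 6, 6, 6]

df = pd.DataFrame({"test": test})

# cummax() skips NaNs by default (skipna=True), so the gaps stay NaN
# while the lower value 2.0 is replaced by the running maximum 3.0;
# ffill() then carries the running maximum into the gaps.
df["desired_output"] = df["test"].cummax().ffill()

# Illustrative only: the same two steps in the opposite order.
df["swapped_order"] = df["test"].ffill().cummax()

# Series.equals() treats NaNs in the same positions as equal.
matches_expected = df["desired_output"].equals(pd.Series(expected))
orders_agree = df["desired_output"].equals(df["swapped_order"])
print(matches_expected, orders_agree)
```

Both checks should come out True for the sample data. The leading NaNs (rows 0 and 1) are left alone because there is no earlier value to carry forward.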

Intermediate Series (cummax skips the NaNs, so the gaps remain until ffill is applied):

df['test'].cummax()

0     NaN
1     NaN
2     1.0
3     NaN
4     NaN
5     3.0
6     NaN
7     NaN
8     3.0
9     NaN
10    6.0
11    NaN
12    NaN
Name: test, dtype: float64
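
Since the question also mentions numpy, a NumPy-only variant may be useful. This is a sketch rather than part of the original answer: it assumes the data is a 1-D float array and relies on `np.fmax`, which returns the non-NaN operand when only one side is NaN, so accumulating it gives a NaN-ignoring running maximum in a single step. The variable name `running_max` is just illustrative.

```python
import numpy as np

# Sample data from the question, as a plain float array
test = np.array([np.nan, np.nan, 1, np.nan, np.nan, 3, np.nan,
                 np.nan, 2, np.nan, 6, np.nan, np.nan])

# np.fmax ignores a NaN operand when the other operand is a number,
# so accumulate() carries the largest value seen so far through the gaps.
# Leading NaNs stay NaN because fmax(NaN, NaN) is NaN.
running_max = np.fmax.accumulate(test)
print(running_max)
```

Note that `np.maximum.accumulate` would not work here, because `np.maximum` propagates NaN and everything after the first NaN would become NaN.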