Is there a way to forward fill with ascending logic in pandas / numpy?

Posted 2025-02-10 02:51:48 | 798 characters | 4 views | 0 comments

What is the most pandastic way to forward fill with ascending logic (without iterating over the rows)?

Input:

import pandas as pd
import numpy as np

df = pd.DataFrame()

df['test'] = np.nan,np.nan,1,np.nan,np.nan,3,np.nan,np.nan,2,np.nan,6,np.nan,np.nan
df['desired_output'] = np.nan,np.nan,1,1,1,3,3,3,3,3,6,6,6

print (df)

Output:

    test  desired_output
0    NaN             NaN
1    NaN             NaN
2    1.0             1.0
3    NaN             1.0
4    NaN             1.0
5    3.0             3.0
6    NaN             3.0
7    NaN             3.0
8    2.0             3.0
9    NaN             3.0
10   6.0             6.0
11   NaN             6.0
12   NaN             6.0

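In the 'test' column, the number of consecutive NaNs is random.

In the 'desired_output' column, I am trying to forward fill with ascending values only. When a lower value is encountered (row 8, value = 2.0 above), it is overwritten with the current higher value.

Can anyone help? Thanks in advance.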

酒与心事 2025-02-17 02:51:48

You can combine cummax to select the cumulative maximum value and ffill to replace the NaNs:

df['desired_output'] = df['test'].cummax().ffill()

Output:

    test  desired_output
0    NaN             NaN
1    NaN             NaN
2    1.0             1.0
3    NaN             1.0
4    NaN             1.0
5    3.0             3.0
6    NaN             3.0
7    NaN             3.0
8    2.0             3.0
9    NaN             3.0
10   6.0             6.0
11   NaN             6.0
12   NaN             6.0
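For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch of the same one-liner. The `expected` list and the `matches` check are additions for illustration, not part of the answer; `Series.equals` is used because it treats NaNs in the same positions as equal.

```python
import numpy as np
import pandas as pd

test = [np.nan, np.nan, 1, np.nan, np.nan, 3, np.nan, np.nan, 2, np.nan, 6, np.nan, np.nan]
# Illustrative copy of the 'desired_output' column from the question.
expected = [np.nan, np.nan, 1, 1, 1, 3, 3, 3, 3, 3, 6, 6, 6]

df = pd.DataFrame({"test": test})

# cummax() skips NaN by default: the gaps stay NaN, and row 8 (2.0)
# is replaced by the running maximum 3.0.
# ffill() then carries the last running maximum forward into the gaps.
df["desired_output"] = df["test"].cummax().ffill()

# Illustrative check: Series.equals treats NaN == NaN at matching positions.
matches = df["desired_output"].equals(pd.Series(expected, name="desired_output"))
print(matches)
```

The leading NaNs (rows 0 and 1) remain NaN because there is no earlier value to carry forward.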

Intermediate Series:

df['test'].cummax()

0     NaN
1     NaN
2     1.0
3     NaN
4     NaN
5     3.0
6     NaN
7     NaN
8     3.0
9     NaN
10    6.0
11    NaN
12    NaN
Name: test, dtype: float64
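Since the question also mentions numpy, here is a sketch of a possible NumPy-only alternative. This is not part of the answer above; it relies on `np.fmax`, which returns the non-NaN operand when only one of its inputs is NaN, so accumulating it should do the running maximum and the fill in a single pass. The variable name `filled` is just for illustration.

```python
import numpy as np

test = np.array([np.nan, np.nan, 1, np.nan, np.nan, 3, np.nan,
                 np.nan, 2, np.nan, 6, np.nan, np.nan])

# np.fmax ignores a NaN when the other operand is a number, so
# accumulating it keeps the running maximum across the NaN gaps.
# Leading NaNs stay NaN because fmax(NaN, NaN) is NaN.
filled = np.fmax.accumulate(test)
print(filled)
```

With a DataFrame, the same idea would be `df['desired_output'] = np.fmax.accumulate(df['test'].to_numpy())`, assuming the column is a float dtype.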
