How many rows does interpolation consider?

Posted on 2025-01-17 00:55:21

How does pandas' DataFrame.interpolate() work in relation to the number of rows it considers:

  1. Is it just the row before the NaNs and the row right after?
  2. Or is it the whole DataFrame (how does that work at 1 million rows)?
  3. Or another way (please explain)?

Edit:
(with method='polynomial' ideally)

Comments (1)

无需解释 2025-01-24 00:55:21

With method='polynomial', DataFrame.interpolate() starts at the first non-NaN value and stops at the last non-NaN value, leaving leading and trailing NaNs unchanged:

>>> import numpy as np
>>> import pandas as pd
>>> n = np.nan
>>> df = pd.DataFrame([n, n, 4, 2, 4, n, n, 2, n, n, n, n, n])
>>> df
      0
0   NaN
1   NaN
2   4.0
3   2.0
4   4.0
5   NaN
6   NaN
7   2.0
8   NaN
9   NaN
10  NaN
11  NaN
12  NaN

>>> df.interpolate(method='polynomial', order=2)
           0
0        NaN  <--- 
1        NaN  <--- first 2 NaN values are unchanged
2   4.000000
3   2.000000
4   4.000000
5   5.076923
6   4.410256
7   2.000000  <--- last non-NaN value
8        NaN  <--- this and subsequent NaN values unchanged
9        NaN
10       NaN
11       NaN
12       NaN
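
On the original question of how many rows are used: the pandas documentation says method='polynomial' is passed to scipy.interpolate.interp1d, and each column is interpolated on its own. The sketch below assumes that delegation and rebuilds the result by hand from all non-NaN rows of the column, using the index as x. The names s, fit, manual and linear are illustrative only. A neighbours-only linear fill is included for contrast.

```python
import numpy as np
import pandas as pd
from scipy.interpolate import interp1d

n = np.nan
s = pd.Series([n, n, 4, 2, 4, n, n, 2, n, n, n, n, n])

# pandas result; rows 5 and 6 form the interior gap.
poly = s.interpolate(method="polynomial", order=2)

# Hand-rolled equivalent (assumption: pandas delegates to interp1d):
# fit on ALL non-NaN rows of the column, with the index as x.
mask = s.notna().to_numpy()
x_all = s.index.to_numpy(dtype=float)
fit = interp1d(x_all[mask], s.to_numpy()[mask], kind=2, bounds_error=False)
manual = fit(np.array([5.0, 6.0]))

# Baseline that only looks at the rows bordering the gap.
linear = s.interpolate(method="linear")

print(poly.loc[[5, 6]].to_numpy())    # pandas, order-2 fit
print(manual)                         # scipy, same valid points
print(linear.loc[[5, 6]].to_numpy())  # straight line between rows 4 and 7
```

If the two order-2 results agree while the linear one differs, that suggests the fit draws on every valid row in the column rather than only the two rows bordering the gap.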

If you'd like the leading and trailing NaN values filled, chain bfill() and ffill() after the interpolation; they copy the nearest valid value outward rather than extrapolating:

>>> df.interpolate(method='polynomial', order=2).bfill().ffill()
           0
0   4.000000
1   4.000000
2   4.000000
3   2.000000
4   4.000000
5   5.076923
6   4.410256
7   2.000000
8   2.000000
9   2.000000
10  2.000000
11  2.000000
12  2.000000
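
To see what each call in that chain contributes, here is a small sketch on a made-up Series. It swaps in method='linear' with limit_area='inside' so that it runs without SciPy while still leaving the edge NaNs alone, as the polynomial method does. The data and variable names are hypothetical.

```python
import numpy as np
import pandas as pd

s = pd.Series([np.nan, 1.0, np.nan, 3.0, np.nan, np.nan])

# Fill only NaNs that sit between two valid values.
interior = s.interpolate(method="linear", limit_area="inside")

# bfill copies the first valid value backward into the leading NaN;
# the trailing NaNs have no later value to copy, so they remain.
after_bfill = interior.bfill()

# ffill then copies the last valid value forward into the trailing NaNs.
filled = after_bfill.ffill()

print(interior.tolist())
print(after_bfill.tolist())
print(filled.tolist())
```

The edges end up as flat copies of the nearest valid value, which may or may not be what you want for a polynomial fit.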