当前位置：文江博客话题详情

Python NaN numpy

将 nan 值转换为零

发布于 2024-10-19 10:38:14 字数 759 浏览 2 评论 0 原文

我有一个 2D numpy 数组。该数组中的某些值是 NaN。我想使用这个数组执行某些操作。例如，考虑一下数组：

[[   0.   43.   67.    0.   38.]
 [ 100.   86.   96.  100.   94.]
 [  76.   79.   83.   89.   56.]
 [  88.   NaN   67.   89.   81.]
 [  94.   79.   67.   89.   69.]
 [  88.   79.   58.   72.   63.]
 [  76.   79.   71.   67.   56.]
 [  71.   71.   NaN   56.  100.]]

我尝试一次获取每一行，以相反的顺序对其进行排序，以从该行中获取最多 3 个值并取它们的平均值。我尝试的代码是：

# nparr is a 2D numpy array
for entry in nparr:
    sortedentry = sorted(entry, reverse=True)
    highest_3_values = sortedentry[:3]
    avg_highest_3 = float(sum(highest_3_values)) / 3

这不适用于包含 NaN 的行。我的问题是，是否有一种快速方法可以将 2D numpy 数组中的所有 NaN 值转换为零，以便我在排序和尝试做的其他事情上没有问题。

原文

I have a 2D numpy array. Some of the values in this array are NaN. I want to perform certain operations using this array. For example consider the array:

[[   0.   43.   67.    0.   38.]
 [ 100.   86.   96.  100.   94.]
 [  76.   79.   83.   89.   56.]
 [  88.   NaN   67.   89.   81.]
 [  94.   79.   67.   89.   69.]
 [  88.   79.   58.   72.   63.]
 [  76.   79.   71.   67.   56.]
 [  71.   71.   NaN   56.  100.]]

I am trying to take each row, one at a time, sort it in reversed order to get max 3 values from the row and take their average. The code I tried is:

# nparr is a 2D numpy array
for entry in nparr:
    sortedentry = sorted(entry, reverse=True)
    highest_3_values = sortedentry[:3]
    avg_highest_3 = float(sum(highest_3_values)) / 3

This does not work for rows containing NaN. My question is, is there a quick way to convert all NaN values to zero in the 2D numpy array so that I have no problems with sorting and other things I am trying to do.

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

你穿错了嫁妆 2024-10-26 10:38:14

其中 A 是二维数组：

import numpy as np
A[np.isnan(A)] = 0

函数 isnan 生成一个 bool 数组，指示 NaN 值的位置。布尔数组可以用来索引相同形状的数组。把它想象成一个面具。

Where A is your 2D array:

import numpy as np
A[np.isnan(A)] = 0

The function isnan produces a bool array indicating where the NaN values are. A boolean array can by used to index an array of the same shape. Think of it like a mask.

回复收藏 0 原文

百思不得你姐 2024-10-26 10:38:14

这应该有效：

from numpy import *

a = array([[1, 2, 3], [0, 3, NaN]])
where_are_NaNs = isnan(a)
a[where_are_NaNs] = 0

在上面的情况下，其中_are_NaNs 是：

In [12]: where_are_NaNs
Out[12]: 
array([[False, False, False],
       [False, False,  True]], dtype=bool)

关于效率的补充。下面的示例使用 numpy 1.21.2 运行，

>>> aa = np.random.random(1_000_000)
>>> a = np.where(aa < 0.15, np.nan, aa)
>>> %timeit a[np.isnan(a)] = 0
536 µs ± 8.11 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
>>> a = np.where(aa < 0.15, np.nan, aa)
>>> %timeit np.where(np.isnan(a), 0, a)
2.38 ms ± 27.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
>>> a = np.where(aa < 0.15, np.nan, aa)
>>> %timeit np.nan_to_num(a, copy=True)
8.11 ms ± 401 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
>>> a = np.where(aa < 0.15, np.nan, aa)
>>> %timeit np.nan_to_num(a, copy=False)
3.8 ms ± 70.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

因此 a[np.isnan(a)] = 0 速度更快。

This should work:

from numpy import *

a = array([[1, 2, 3], [0, 3, NaN]])
where_are_NaNs = isnan(a)
a[where_are_NaNs] = 0

In the above case where_are_NaNs is:

In [12]: where_are_NaNs
Out[12]: 
array([[False, False, False],
       [False, False,  True]], dtype=bool)

A complement about efficiency. The examples below were run with numpy 1.21.2

>>> aa = np.random.random(1_000_000)
>>> a = np.where(aa < 0.15, np.nan, aa)
>>> %timeit a[np.isnan(a)] = 0
536 µs ± 8.11 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
>>> a = np.where(aa < 0.15, np.nan, aa)
>>> %timeit np.where(np.isnan(a), 0, a)
2.38 ms ± 27.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
>>> a = np.where(aa < 0.15, np.nan, aa)
>>> %timeit np.nan_to_num(a, copy=True)
8.11 ms ± 401 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
>>> a = np.where(aa < 0.15, np.nan, aa)
>>> %timeit np.nan_to_num(a, copy=False)
3.8 ms ± 70.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In consequence a[np.isnan(a)] = 0 is faster.

回复收藏 0 原文

随梦而飞# 2024-10-26 10:38:14

nan_to_num() 怎么样？

回复收藏 0 原文

北恋 2024-10-26 10:38:14

您可以使用 np.where 查找 NaN 的位置：

import numpy as np

a = np.array([[   0,   43,   67,    0,   38],
              [ 100,   86,   96,  100,   94],
              [  76,   79,   83,   89,   56],
              [  88,   np.nan,   67,   89,   81],
              [  94,   79,   67,   89,   69],
              [  88,   79,   58,   72,   63],
              [  76,   79,   71,   67,   56],
              [  71,   71,   np.nan,   56,  100]])

b = np.where(np.isnan(a), 0, a)

In [20]: b
Out[20]: 
array([[   0.,   43.,   67.,    0.,   38.],
       [ 100.,   86.,   96.,  100.,   94.],
       [  76.,   79.,   83.,   89.,   56.],
       [  88.,    0.,   67.,   89.,   81.],
       [  94.,   79.,   67.,   89.,   69.],
       [  88.,   79.,   58.,   72.,   63.],
       [  76.,   79.,   71.,   67.,   56.],
       [  71.,   71.,    0.,   56.,  100.]])

You could use np.where to find where you have NaN:

import numpy as np

a = np.array([[   0,   43,   67,    0,   38],
              [ 100,   86,   96,  100,   94],
              [  76,   79,   83,   89,   56],
              [  88,   np.nan,   67,   89,   81],
              [  94,   79,   67,   89,   69],
              [  88,   79,   58,   72,   63],
              [  76,   79,   71,   67,   56],
              [  71,   71,   np.nan,   56,  100]])

b = np.where(np.isnan(a), 0, a)

In [20]: b
Out[20]: 
array([[   0.,   43.,   67.,    0.,   38.],
       [ 100.,   86.,   96.,  100.,   94.],
       [  76.,   79.,   83.,   89.,   56.],
       [  88.,    0.,   67.,   89.,   81.],
       [  94.,   79.,   67.,   89.,   69.],
       [  88.,   79.,   58.,   72.,   63.],
       [  76.,   79.,   71.,   67.,   56.],
       [  71.,   71.,    0.,   56.,  100.]])

回复收藏 0 原文

仙女山的月亮 2024-10-26 10:38:14

drake的答案使用nan_to_num：

>>> import numpy as np
>>> A = np.array([[1, 2, 3], [0, 3, np.NaN]])
>>> A = np.nan_to_num(A)
>>> A
array([[ 1.,  2.,  3.],
       [ 0.,  3.,  0.]])

A code example for drake's answer to use nan_to_num:

>>> import numpy as np
>>> A = np.array([[1, 2, 3], [0, 3, np.NaN]])
>>> A = np.nan_to_num(A)
>>> A
array([[ 1.,  2.,  3.],
       [ 0.,  3.,  0.]])

回复收藏 0 原文

柏拉图鍀咏恒 2024-10-26 10:38:14

您可以使用 numpy.nan_to_num ：

numpy.nan_to_num(x)：将nan替换为零，将inf替换为有限数。

示例（参见文档）：

>>> np.set_printoptions(precision=8)
>>> x = np.array([np.inf, -np.inf, np.nan, -128, 128])
>>> np.nan_to_num(x)
array([  1.79769313e+308,  -1.79769313e+308,   0.00000000e+000,
        -1.28000000e+002,   1.28000000e+002])

You can use numpy.nan_to_num :

numpy.nan_to_num(x) : Replace nan with zero and inf with finite numbers.

Example (see doc) :

>>> np.set_printoptions(precision=8)
>>> x = np.array([np.inf, -np.inf, np.nan, -128, 128])
>>> np.nan_to_num(x)
array([  1.79769313e+308,  -1.79769313e+308,   0.00000000e+000,
        -1.28000000e+002,   1.28000000e+002])

回复收藏 0 原文

半枫 2024-10-26 10:38:14

nan 永远不等于 nan

if z!=z:z=0

所以对于二维数组

for entry in nparr:
    if entry!=entry:entry=0

nan is never equal to nan

if z!=z:z=0

so for a 2D array

for entry in nparr:
    if entry!=entry:entry=0

回复收藏 0 原文

甜｀诱少女 2024-10-26 10:38:14

您可以使用 lambda 函数，这是一维数组的示例：

import numpy as np
a = [np.nan, 2, 3]
map(lambda v:0 if np.isnan(v) == True else v, a)

这将为您提供结果：

[0, 2, 3]

You can use lambda function, an example for 1D array:

import numpy as np
a = [np.nan, 2, 3]
map(lambda v:0 if np.isnan(v) == True else v, a)

This will give you the result:

[0, 2, 3]

回复收藏 0 原文

夜声 2024-10-26 10:38:14

出于您的目的，如果所有项目都存储为 str 并且您只需使用您正在使用的排序，然后检查第一个元素并将其替换为“0”

>>> l1 = ['88','NaN','67','89','81']
>>> n = sorted(l1,reverse=True)
['NaN', '89', '88', '81', '67']
>>> import math
>>> if math.isnan(float(n[0])):
...     n[0] = '0'
... 
>>> n
['0', '89', '88', '81', '67']

For your purposes, if all the items are stored as str and you just use sorted as you are using and then check for the first element and replace it with '0'

>>> l1 = ['88','NaN','67','89','81']
>>> n = sorted(l1,reverse=True)
['NaN', '89', '88', '81', '67']
>>> import math
>>> if math.isnan(float(n[0])):
...     n[0] = '0'
... 
>>> n
['0', '89', '88', '81', '67']

回复收藏 0 原文

~没有更多了~