How do I remove all zero elements from a NumPy array?
I have a rank-1 numpy.array of which I want to make a boxplot. However, I want to exclude all values equal to zero in the array. Currently, I solve this by looping over the array and copying each value to a new array if it is not equal to zero. However, as the array consists of 86 000 000 values and I have to do this multiple times, this takes a lot of patience.
Is there a more intelligent way to do this?
For a NumPy array a, you can use [...] to extract the values not equal to zero.
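A minimal sketch of one such expression, assuming the boolean indexing that a later answer in this thread writes as array[array != 0]; the sample values are made up for illustration:

```python
import numpy as np

# Hypothetical sample data; the array in the question has about 86 million values.
a = np.array([0.0, 1.5, 0.0, 2.0, -3.0, 0.0])

# a != 0 builds a boolean mask; indexing with it copies only the positions
# where the mask is True into a new, shorter array.
nonzero_values = a[a != 0]

print(nonzero_values)
```

The comparison and the selection both run in compiled NumPy code, so no Python-level loop over the elements is needed.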
This is a case where you want to use masked arrays: a masked array keeps the shape of your array, and it is automatically recognized by all numpy and matplotlib functions.
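A small sketch of the masked-array idea, assuming np.ma.masked_equal is used to build the mask (the answer does not say which constructor it had in mind) and using made-up sample values:

```python
import numpy as np

a = np.array([0.0, 1.0, 0.0, 2.0, 3.0])

# Mask every element equal to 0; the result has the same shape as a.
masked = np.ma.masked_equal(a, 0)

# Reductions on a masked array skip the masked entries.
mean_without_zeros = masked.mean()

# compressed() returns only the unmasked values as an ordinary 1-D ndarray.
values = masked.compressed()

print(masked.shape, mean_without_zeros, values)
```

If a particular plotting call turns out not to honor the mask, passing masked.compressed() instead gives it a plain array with the zeros already dropped.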
I decided to compare the runtime of the different approaches mentioned here. I've used my library simple_benchmark for this.
The boolean indexing with array[array != 0] seems to be the fastest (and shortest) solution.
For smaller arrays the MaskedArray approach is very slow compared to the other approaches, but otherwise it is as fast as the boolean indexing approach. For moderately sized arrays there is not much difference between them.
Here is the code I've used:
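As a rough stand-in for such a comparison, here is a sketch that uses the standard-library timeit module rather than simple_benchmark. The three functions, the array sizes, and the random test data are assumptions chosen for illustration, not the setup behind the results described above:

```python
import timeit

import numpy as np


def boolean_indexing(arr):
    return arr[arr != 0]


def nonzero_indexing(arr):
    return arr[np.nonzero(arr)]


def masked_array(arr):
    return np.ma.masked_equal(arr, 0)


# Hypothetical test data: random integers in [0, 5), so some entries are zero.
rng = np.random.default_rng(0)
timings = {}
for size in (100, 10_000, 1_000_000):
    arr = rng.integers(0, 5, size=size)
    for func in (boolean_indexing, nonzero_indexing, masked_array):
        # Best of 3 repeats of 10 calls each, reported as seconds per call.
        best = min(timeit.repeat(lambda: func(arr), number=10, repeat=3)) / 10
        timings[(func.__name__, size)] = best
        print(f"{func.__name__:>18} n={size:>9}: {best:.2e} s")
```

Absolute numbers depend on the machine and the NumPy version, so only the relative ordering at each size is worth reading from the output.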
You can index with a Boolean array. For a NumPy array A, you can use Boolean array indexing as above, bool type conversion, np.nonzero, or np.where. Here's some performance benchmarking:
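The four options named in this answer can be sketched side by side. The exact expressions are an assumption about what the answer intended, and the sample array is made up:

```python
import numpy as np

A = np.array([0, 3, 0, 7, 1, 0])

by_comparison = A[A != 0]          # Boolean array indexing
by_bool_cast = A[A.astype(bool)]   # bool conversion: nonzero values become True
by_nonzero = A[np.nonzero(A)]      # index with the positions of nonzero entries
by_where = A[np.where(A)]          # one-argument where returns the same indices

print(by_comparison, by_bool_cast, by_nonzero, by_where)
```

All four produce the same values; they differ only in how the selection is built, which is what a benchmark would compare.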
I would suggest simply using NaN for cases like this, where you want to ignore some values but still keep the statistical procedure as meaningful as possible.
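A minimal sketch of the NaN idea, assuming the zeros are replaced with np.where and the statistics are taken with NumPy's NaN-aware functions; the sample values are made up:

```python
import numpy as np

# The array must have a float dtype, since NaN cannot be stored in an integer array.
a = np.array([0.0, 1.0, 0.0, 2.0, 3.0])

# Build a copy in which every zero is replaced by NaN; a itself is unchanged.
with_nan = np.where(a == 0, np.nan, a)

# The nan* functions skip NaN entries when computing the statistic.
mean_ignoring_nan = np.nanmean(with_nan)
median_ignoring_nan = np.nanmedian(with_nan)

print(with_nan, mean_ignoring_nan, median_ignoring_nan)
```

This keeps the array at its original length, so positions still line up with any other arrays of the same shape.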
A simple line of code can get you a list that excludes all '0' values:
example:
[i for i in Array if i != 0.0]
if the numbers are float, or
[i for i in Array if i != 0]
if the numbers are int.
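A short sketch of that one-liner applied to a NumPy array, with made-up sample values. The conversion back with np.asarray is an addition for illustration, since the comprehension itself yields a Python list:

```python
import numpy as np

Array = np.array([0.0, 1.5, 0.0, 2.0])

# The comprehension visits the elements one by one in Python and builds a list.
as_list = [i for i in Array if i != 0]

# Convert back if a NumPy array is needed downstream.
as_array = np.asarray(as_list)

print(as_list, as_array)
```

Since 0 == 0.0 in Python, the same comparison works for both int and float data. The loop runs element by element in the interpreter, so on an array of the size mentioned in the question it is likely to be much slower than the vectorized approaches.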