MATLAB-style find() function in Python

Posted on 2024-11-06 03:54:52

In MATLAB it is easy to find the indices of values that meet a particular condition:

>> a = [1,2,3,1,2,3,1,2,3];
>> find(a > 2)     % find the indices where this condition is true
[3, 6, 9]          % (MATLAB uses 1-based indexing)
>> a(find(a > 2))  % get the values at those locations
[3, 3, 3]

What would be the best way to do this in Python?

So far, I have come up with the following. To just get the values:

>>> a = [1,2,3,1,2,3,1,2,3]
>>> [val for val in a if val > 2]
[3, 3, 3]

But if I want the index of each of those values it's a bit more complicated:

>>> a = [1,2,3,1,2,3,1,2,3]
>>> inds = [i for (i, val) in enumerate(a) if val > 2]
>>> inds
[2, 5, 8]
>>> [val for (i, val) in enumerate(a) if i in inds]
[3, 3, 3]

Is there a better way to do this in Python, especially for arbitrary conditions (not just 'val > 2')?

I found functions equivalent to MATLAB 'find' in NumPy but I currently do not have access to those libraries.

Comments (9)

一杆小烟枪 2024-11-13 03:54:52

In NumPy you have where:

>>> import numpy as np
>>> x = np.random.randint(0, 20, 10)
>>> x
array([14, 13,  1, 15,  8,  0, 17, 11, 19, 13])
>>> np.where(x > 10)
(array([0, 1, 3, 6, 7, 8, 9], dtype=int64),)
烟沫凡尘 2024-11-13 03:54:52

You can make a function that takes a callable parameter which will be used in the condition part of your list comprehension. Then you can use a lambda or other function object to pass your arbitrary condition:

def indices(a, func):
    return [i for (i, val) in enumerate(a) if func(val)]

a = [1, 2, 3, 1, 2, 3, 1, 2, 3]

inds = indices(a, lambda x: x > 2)

>>> inds
[2, 5, 8]

It's a little closer to your MATLAB example, without having to load all of NumPy.
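As a minimal sketch of how this helper can also cover the "get the values at those locations" step from the question: looking each index up directly avoids rescanning the list with `if i in inds`. The name values_at is hypothetical and only used for illustration.

```python
def indices(a, func):
    """Return the 0-based positions of the items in a for which func(item) is true."""
    return [i for i, val in enumerate(a) if func(val)]


def values_at(a, inds):
    """Hypothetical helper: look up each index directly, like MATLAB's a(inds)."""
    return [a[i] for i in inds]


a = [1, 2, 3, 1, 2, 3, 1, 2, 3]
inds = indices(a, lambda x: x > 2)
print(inds)                    # [2, 5, 8]
print([i + 1 for i in inds])   # [3, 6, 9], MATLAB's 1-based numbering
print(values_at(a, inds))      # [3, 3, 3]
```

Any callable works as the condition, so a named function can be passed instead of a lambda.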

毁梦 2024-11-13 03:54:52

Or use NumPy's nonzero function:

import numpy as np
a = np.array([1, 2, 3, 4, 5])
inds = np.nonzero(a > 2)   # tuple of index arrays, one per dimension
a[inds]                    # array([3, 4, 5])
初见终念 2024-11-13 03:54:52

Why not just use this:

[i for i in range(len(a)) if a[i] > 2]

or for arbitrary conditions, define a function f for your condition and do:

[i for i in range(len(a)) if f(a[i])]
放血 2024-11-13 03:54:52

The NumPy routine more commonly used for this is numpy.where(); called with only a condition, it is equivalent to numpy.nonzero().

import numpy
a    = numpy.array([1,2,3,4,5])
inds = numpy.where(a>2)

To get the values, you can either store the indices and index with them:

a[inds]

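or index with the boolean mask directly (numpy.where() takes either the condition alone or the condition plus two arrays, so numpy.where(a>2, a) raises a ValueError):

a[a>2]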

or pass two arrays, taking elements from the first where the condition is true and from the second elsewhere:

b = numpy.array([11,22,33,44,55])
numpy.where(a>2, a, b)
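
A small sketch contrasting the one-argument and three-argument forms of numpy.where(), assuming NumPy is installed. The variable names inds and picked are only illustrative.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])
b = np.array([11, 22, 33, 44, 55])

inds = np.where(a > 2)[0]        # one-argument form: positions where the condition holds
picked = np.where(a > 2, a, b)   # three-argument form: a where true, b elsewhere

print(inds.tolist())      # [2, 3, 4]
print(a[inds].tolist())   # [3, 4, 5]
print(picked.tolist())    # [11, 22, 3, 4, 5]
```

The `[0]` is needed because the one-argument form returns a tuple with one index array per dimension.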
避讳 2024-11-13 03:54:52

To get values with arbitrary conditions, you could use filter() with a lambda function (in Python 3, filter() returns an iterator, so wrap it in list()):

>>> a = [1,2,3,1,2,3,1,2,3]
>>> list(filter(lambda x: x > 2, a))
[3, 3, 3]

One possible way to get the indices would be to use enumerate() to build a tuple of (index, value) pairs, and then filter that:

>>> a = [1,2,3,1,2,3,1,2,3]
>>> aind = tuple(enumerate(a))
>>> print(aind)
((0, 1), (1, 2), (2, 3), (3, 1), (4, 2), (5, 3), (6, 1), (7, 2), (8, 3))
>>> tuple(filter(lambda x: x[1] > 2, aind))
((2, 3), (5, 3), (8, 3))
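
If separate lists of indices and values are wanted rather than pairs, one possible follow-up is to transpose the pairs with zip(). This is only a sketch; the names pairs, inds, and vals are illustrative.

```python
a = [1, 2, 3, 1, 2, 3, 1, 2, 3]

# Keep (index, value) pairs whose value passes the test.
pairs = [(i, val) for i, val in enumerate(a) if val > 2]

# zip(*pairs) transposes the pairs; the guard covers the no-match case.
inds, vals = (list(t) for t in zip(*pairs)) if pairs else ([], [])

print(inds)  # [2, 5, 8]
print(vals)  # [3, 3, 3]
```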
凌乱心跳 2024-11-13 03:54:52

I've been trying to figure out a fast way to do this exact thing, and here is what I stumbled upon (uses numpy for its fast vector comparison):

a_bool = numpy.array(a) > 2
inds = [i for (i, val) in enumerate(a_bool) if val]

It turns out that this is much faster than:

inds = [i for (i, val) in enumerate(a) if val > 2]

It seems that Python is faster at comparison when done in a numpy array, and/or faster at doing list comprehensions when just checking truth rather than comparison.

Edit:

I was revisiting my code and came across a possibly less memory-intensive, slightly faster, and very concise way of doing this in one line:

inds = numpy.arange(len(a))[numpy.array(a) > 2]
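
Speed claims like these depend on the data size and the machine, so it may be worth measuring. Here is a rough sketch using timeit, assuming NumPy is installed; the function names with_list and with_mask are hypothetical, and no particular timings are implied.

```python
import timeit

import numpy as np

a = [1, 2, 3] * 3000   # plain Python list, 9000 items
arr = np.array(a)      # converted once, outside the timed code


def with_list():
    """Pure-Python list comprehension over the list."""
    return [i for i, val in enumerate(a) if val > 2]


def with_mask():
    """Boolean-mask indexing on the NumPy array."""
    return np.arange(len(arr))[arr > 2]


# Both approaches should agree before their timings are compared.
assert with_list() == with_mask().tolist()

for fn in (with_list, with_mask):
    seconds = timeit.timeit(fn, number=100)
    print(f"{fn.__name__}: {seconds:.4f} s for 100 runs")
```

Note that the list-to-array conversion is kept out of the timed code here; if the data starts as a list, that conversion cost should be counted as well.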
末骤雨初歇 2024-11-13 03:54:52

I think I may have found a quick and simple substitute.
By the way, I find np.where() not very satisfactory, because for a 2-D array like the one below it also returns an array of row indices, which here is an annoying row of zeros.

import numpy as np
import matplotlib.mlab as mlab  # mlab.find is only available in older Matplotlib releases
a = np.random.randn(1,5)
print(a)

>> [[ 1.36406736  1.45217257 -0.06896245  0.98429727 -0.59281957]]

idx = mlab.find(a<0)
print(idx)
type(idx)

>> [2 4]
>> np.ndarray

Best,
Da
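
For setups where mlab.find is not available, a similar flat result can be sketched with numpy.flatnonzero(), assuming NumPy is installed. The array below uses fixed values (rounded from the output above) instead of random ones so the result is reproducible.

```python
import numpy as np

# Fixed values instead of random ones so the result is reproducible.
a = np.array([[1.36, 1.45, -0.07, 0.98, -0.59]])

rows, cols = np.where(a < 0)   # one index array per dimension
idx = np.flatnonzero(a < 0)    # indices into the flattened array

print(rows.tolist())  # [0, 0]
print(cols.tolist())  # [2, 4]
print(idx.tolist())   # [2, 4]
```

The flattening is row-major, whereas MATLAB's linear indices are column-major, so the two differ for arrays with more than one row.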

如何视而不见 2024-11-13 03:54:52

MATLAB's find accepts an optional second argument. John's code accounts for the first argument but not the second. For instance, if you only want the first index at which the condition is satisfied, the MATLAB call would be:

find(x>2,1)

Using John's code, all you have to do is add [x] after the call to indices, where x is the position of the match you're looking for (0 for the first match).

def indices(a, func):
    return [i for (i, val) in enumerate(a) if func(val)]

a = [1, 2, 3, 1, 2, 3, 1, 2, 3]

inds = indices(a, lambda x: x > 2)[0]  # [0] plays the role of MATLAB's second argument

which returns 2, the index of the first value that exceeds 2.
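
Since MATLAB's find(x>2, n) returns the first n matches, a closer analogue might stop scanning once n matches have been found. The following is a sketch using itertools.islice; find_first is a hypothetical name, not part of any library.

```python
from itertools import islice


def find_first(a, func, n=1):
    """Hypothetical helper: 0-based indices of the first n items satisfying func."""
    matches = (i for i, val in enumerate(a) if func(val))
    return list(islice(matches, n))


a = [1, 2, 3, 1, 2, 3, 1, 2, 3]
print(find_first(a, lambda x: x > 2))        # [2]
print(find_first(a, lambda x: x > 2, n=2))   # [2, 5]
print(find_first(a, lambda x: x > 5))        # []
```

Unlike indexing with [0], which raises an IndexError when nothing matches, this returns an empty list in that case.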
