Using Numpy Vectorize on Functions that Return Vectors

Posted on 2024-09-12 11:01:12

numpy.vectorize takes a function f:a->b and turns it into g:a[]->b[].

This works fine when a and b are scalars, but I can't think of a reason why it wouldn't work with b as an ndarray or list, i.e. f:a->b[] and g:a[]->b[][]

For example:

import numpy as np
def f(x):
    return x * np.array([1,1,1,1,1], dtype=np.float32)
g = np.vectorize(f, otypes=[np.ndarray])
a = np.arange(4)
print(g(a))

This yields:

array([[ 0.  0.  0.  0.  0.],
       [ 1.  1.  1.  1.  1.],
       [ 2.  2.  2.  2.  2.],
       [ 3.  3.  3.  3.  3.]], dtype=object)

Ok, so that gives the right values, but the wrong dtype. And even worse:

g(a).shape

yields:

(4,)

So this array is pretty much useless. I know I can convert it doing:

np.array(list(map(list, g(a))), dtype=np.float32)

to give me what I want:

array([[ 0.,  0.,  0.,  0.,  0.],
       [ 1.,  1.,  1.,  1.,  1.],
       [ 2.,  2.,  2.,  2.,  2.],
       [ 3.,  3.,  3.,  3.,  3.]], dtype=float32)

but that is neither efficient nor pythonic. Can any of you guys find a cleaner way to do this?

Comments (6)

金橙橙 2024-09-19 11:01:12

np.vectorize is just a convenience function. It doesn't actually make code run any faster. If it isn't convenient to use np.vectorize, simply write your own function that works as you wish.

The purpose of np.vectorize is to transform functions which are not numpy-aware (e.g. take floats as input and return floats as output) into functions that can operate on (and return) numpy arrays.

Your function f is already numpy-aware -- it uses a numpy array in its definition and returns a numpy array. So np.vectorize is not a good fit for your use case.

The solution therefore is just to roll your own function f that works the way you desire.
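
As a rough illustration of "rolling your own", here is a minimal sketch. The wrapper name g and its stacking strategy are assumptions made for this example, not something from the question; it also assumes a non-empty input and that f always returns arrays of the same shape.

```python
import numpy as np

def f(x):
    # Per-element function from the question: scalar in, length-5 vector out.
    return x * np.array([1, 1, 1, 1, 1], dtype=np.float32)

def g(xs):
    # Hypothetical hand-rolled replacement for np.vectorize(f):
    # call f once per element, then stack the results into a regular array.
    xs = np.asarray(xs)
    rows = [f(x) for x in xs.ravel()]          # assumes xs is non-empty
    return np.stack(rows).reshape(xs.shape + rows[0].shape)

out = g(np.arange(4))
print(out.shape)  # (4, 5)
```

This is still a Python-level loop, so it is no faster than np.vectorize; it only gives a regular 2-D result instead of an object array.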

半岛未凉 2024-09-19 11:01:12

A new parameter, signature, added in 1.12.0 does exactly what you want.

def f(x):
    return x * np.array([1,1,1,1,1], dtype=np.float32)

g = np.vectorize(f, signature='()->(n)')

Then g(np.arange(4)).shape will give (4, 5) (displayed as (4L, 5L) on some Python 2 builds).

Here the signature of f is specified: () is the shape of the parameter, which is a scalar, and (n) is the shape of the return value. Parameters can be arrays too. For more complex signatures, see the Generalized Universal Function API.
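
To make the signature notation more concrete, the sketch below applies it to a higher-dimensional input and to a function whose parameter is itself an array. The helper scale_row and the wrapper h are hypothetical names introduced only for this example; NumPy 1.12 or later is assumed.

```python
import numpy as np

def f(x):
    return x * np.array([1, 1, 1, 1, 1], dtype=np.float32)

# Scalar in, vector of length n out.
g = np.vectorize(f, signature='()->(n)')
shape_1d = g(np.arange(4)).shape        # loop dims (4,) + core dims (5,)
shape_2d = g(np.ones((2, 3))).shape     # loop dims (2, 3) + core dims (5,)
print(shape_1d, shape_2d)               # (4, 5) (2, 3, 5)

# Hypothetical example with an array parameter: a row of length n and a scalar.
def scale_row(row, k):
    return row * k

h = np.vectorize(scale_row, signature='(n),()->(n)')
shape_rows = h(np.ones((4, 5)), np.arange(4)).shape
print(shape_rows)                       # (4, 5)
```

The trailing dimensions named in the signature are handed to the function whole, and np.vectorize loops (and broadcasts) over the remaining leading dimensions.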

青芜 2024-09-19 11:01:12

import numpy as np

def f(x):
    return x * np.array([1, 1, 1, 1, 1], dtype=np.float32)

g = np.vectorize(f, otypes=[np.ndarray])

a = np.arange(4)
b = g(a)                    # object array of shape (4,), each element an ndarray
b = np.array(b.tolist())    # rebuild as a regular numeric array
print(b)  # b.shape == (4, 5)

c = np.ones((2, 3, 4))
d = g(c)
d = np.array(d.tolist())
print(d)  # d.shape == (2, 3, 4, 5)

This should fix the problem, and it works regardless of the shape of your input. "map" only works for one-dimensional inputs. Using ".tolist()" and creating a new ndarray solves the problem more completely and more nicely (I believe). Hope this helps.

并安 2024-09-19 11:01:12

You want to vectorize the function

import numpy as np
def f(x):
    return x * np.array([1,1,1,1,1], dtype=np.float32)

Assuming you want a single np.float32 array as the result, you have to specify that as the otype. In your question, however, you specified otypes=[np.ndarray], which means you want every element to be an np.ndarray. Thus, you correctly get a result of dtype=object.

The correct call would be

np.vectorize(f, signature='()->(n)', otypes=[np.float32])

For such a simple function, however, it is better to leverage numpy's ufuncs; np.vectorize just loops over the function. So in your case, just rewrite your function as

def f(x):
    return np.multiply.outer(x, np.array([1,1,1,1,1], dtype=np.float32))

This is faster and produces less obscure errors (note, however, that the result's dtype depends on x: if you pass a complex or quad-precision number, the result will have that type too).
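
A short sketch of how the np.multiply.outer version behaves for different inputs, including the dtype caveat. The module-level name v is an assumption introduced here to avoid rebuilding the constant array on each call.

```python
import numpy as np

v = np.array([1, 1, 1, 1, 1], dtype=np.float32)

def f(x):
    # The output shape is the shape of x followed by the shape of v.
    return np.multiply.outer(x, v)

shape_scalar = f(3).shape                 # () + (5,)
shape_1d = f(np.arange(4)).shape          # (4,) + (5,)
shape_2d = f(np.ones((2, 3))).shape       # (2, 3) + (5,)
print(shape_scalar, shape_1d, shape_2d)   # (5,) (4, 5) (2, 3, 5)

# The result dtype follows the input: complex in, complex out.
complex_dtype = f(np.arange(4, dtype=np.complex64)).dtype
print(complex_dtype)                      # complex64
```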

尬尬 2024-09-19 11:01:12

I've written a function that seems to fit your needs.

import numpy as np

def amap(func, *args):
    '''array version of built-in map
    amap(function, sequence[, sequence, ...]) -> array
    Examples
    --------
    >>> amap(lambda x: x**2, 1)
    array(1)
    >>> amap(lambda x: x**2, [1, 2])
    array([1, 4])
    >>> amap(lambda x,y: y**2 + x**2, 1, [1, 2])
    array([2, 5])
    >>> amap(lambda x: (x, x), 1)
    array([1, 1])
    >>> amap(lambda x,y: [x**2, y**2], [1,2], [3,4])
    array([[1, 9], [4, 16]])
    '''
    # The leading None is a dummy operand; arg[1:] below drops it again.
    args = np.broadcast(None, *args)
    res = np.array([func(*arg[1:]) for arg in args])
    # Broadcast shape of the inputs, followed by the shape of one result.
    shape = args.shape + res.shape[1:]
    return res.reshape(shape)

Let's try it:

def f(x):
    return x * np.array([1,1,1,1,1], dtype=np.float32)
amap(f, np.arange(4))

Outputs

array([[ 0.,  0.,  0.,  0.,  0.],
       [ 1.,  1.,  1.,  1.,  1.],
       [ 2.,  2.,  2.,  2.,  2.],
       [ 3.,  3.,  3.,  3.,  3.]], dtype=float32)

You may also wrap it with lambda or partial for convenience

g = lambda x:amap(f, x)
g(np.arange(4))

Note the docstring of vectorize says

The vectorize function is provided primarily for convenience, not for
performance. The implementation is essentially a for loop.

Thus we would expect amap to perform similarly to vectorize. I haven't measured it; performance tests are welcome.

If performance really matters, you should consider something else, e.g. direct array computation with reshape and broadcast to avoid looping in pure Python (both vectorize and amap fall into the latter case).
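
One way to check the performance point is a small timing sketch like the one below. The names g_vec and g_broadcast, the input size, and the repetition count are all assumptions chosen for illustration; timings vary by machine and NumPy version, so no numbers are quoted here.

```python
import timeit
import numpy as np

v = np.ones(5, dtype=np.float32)

def f(x):
    return x * v

# Python-level loop hidden behind np.vectorize.
g_vec = np.vectorize(f, signature='()->(n)')

def g_broadcast(xs):
    # Direct array computation: add a trailing axis and let NumPy broadcast.
    return np.asarray(xs)[..., np.newaxis] * v

a = np.arange(1000)
same = np.allclose(g_vec(a), g_broadcast(a))   # both give the same values

t_vec = timeit.timeit(lambda: g_vec(a), number=10)
t_bc = timeit.timeit(lambda: g_broadcast(a), number=10)
print(f"np.vectorize: {t_vec:.4f} s, broadcasting: {t_bc:.4f} s")
```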

酷炫老祖宗 2024-09-19 11:01:12

The best way to solve this would be to use a 2-D NumPy array (in this case a column array) as an input to the original function, which will then generate a 2-D output with the results I believe you were expecting.

Here is what it might look like in code:

import numpy as np
def f(x):
    return x*np.array([1, 1, 1, 1, 1], dtype=np.float32)

a = np.arange(4).reshape((4, 1))
b = f(a)
# b is a 2-D array with shape (4, 5)
print(b)

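This is a much simpler and less error-prone way to complete the operation. Rather than trying to transform the function with numpy.vectorize, this method relies on NumPy's natural ability to broadcast arrays. The trick is to make sure the shapes are broadcast-compatible: aligned from the trailing dimension, each pair of dimensions must either be equal or one of them must be 1. Here the (4, 1) column broadcasts against the length-5 row to give a (4, 5) result.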
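
The broadcasting rule can be checked directly. The sketch below is an illustration with assumed variable names; it shows the shape mismatch without the extra axis, and how a trailing np.newaxis generalizes the reshape to inputs of any dimension.

```python
import numpy as np

v = np.array([1, 1, 1, 1, 1], dtype=np.float32)
a = np.arange(4)

# Shapes (4,) and (5,) are not compatible: 4 != 5 and neither is 1.
try:
    a * v
    mismatch = False
except ValueError as err:
    mismatch = True
    print("cannot broadcast:", err)

# Shapes (4, 1) and (5,) are compatible and broadcast to (4, 5).
col = a[:, np.newaxis]
col_shape = np.broadcast(col, v).shape
print(col_shape)        # (4, 5)

# A trailing new axis works for any input dimension.
c = np.ones((2, 3, 4))
out = c[..., np.newaxis] * v
print(out.shape)        # (2, 3, 4, 5)
```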
