Changing the output array of a numpy function in place
I'm trying to write a function that performs a mathematical operation on an array and returns the result. A simplified example could be:
def original_func(A):
    return A[1:] + A[:-1]
To speed this up and avoid allocating a new output array on every call, I would like to pass the output array as an argument and modify it in place:
def inplace_func(A, out):
    out[:] = A[1:] + A[:-1]
However, when calling these two functions in the following manner,
import numpy
A = numpy.random.rand(1000,1000)
out = numpy.empty((999,1000))
C = original_func(A)
inplace_func(A, out)
the original function seems to be twice as fast as the in-place function. How can this be explained? Shouldn't the in-place function be quicker since it doesn't have to allocate memory?
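To make the comparison reproducible, here is a minimal timing sketch using the standard `timeit` module. The repeat counts and the printed format are my own choices rather than anything from the question, and the absolute numbers will vary by machine and NumPy version.

```python
import timeit

import numpy as np


def original_func(A):
    # Allocates and returns a new array holding the sum.
    return A[1:] + A[:-1]


def inplace_func(A, out):
    # Builds the sum in a temporary array, then copies it into `out`.
    out[:] = A[1:] + A[:-1]


A = np.random.rand(1000, 1000)
out = np.empty((999, 1000))

calls = 20  # calls per timing run (arbitrary choice)
t_original = min(timeit.repeat(lambda: original_func(A), number=calls, repeat=5))
t_inplace = min(timeit.repeat(lambda: inplace_func(A, out), number=calls, repeat=5))

print(f"original_func: {t_original / calls * 1e3:.2f} ms per call")
print(f"inplace_func:  {t_inplace / calls * 1e3:.2f} ms per call")
```

Taking the minimum over several repeats reduces noise from other processes, which matters when the difference under test is only about a factor of two.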
Comments (3)
If you want to perform the operation in place, call the function form of the operation and pass it the output array. This does not create any temporaries, which A[1:] + A[:-1] does. All NumPy binary operations have corresponding functions; see the list here: http://docs.scipy.org/doc/numpy/reference/ufuncs.html#available-ufuncs
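The code for this answer did not survive on this page, so the following is a sketch of what the function form could look like. It assumes that `numpy.add` with its `out` argument is the function the answer refers to, which matches the ufunc list it links to.

```python
import numpy as np


def inplace_func(A, out):
    # Assumption: np.add is the function form of `+` meant by the answer.
    # The ufunc writes each element of the sum directly into `out`,
    # so no full-size temporary array is allocated.
    np.add(A[1:], A[:-1], out=out)


A = np.random.rand(1000, 1000)
out = np.empty((999, 1000))
inplace_func(A, out)
```

`out` must already have the shape and a compatible dtype for the result, here `(999, 1000)` and `float64`.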
I think the answer is the following:
In both cases you compute A[1:] + A[:-1], and in both cases you actually create an intermediate array. In the second case, though, you then explicitly copy that whole newly allocated array into the reserved memory of out. Copying such an array takes about as long as the original operation, so you in fact double the time.
To sum up, in the first case you compute A[1:] + A[:-1] into a new array and return it. In the second case you compute A[1:] + A[:-1] into a new array and then copy the result into out.
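The two steps of the second case can be spelled out explicitly. This is an illustrative rewrite of `inplace_func` from the question, not code from the answer; the name `inplace_func_spelled_out` and the returned temporary are hypothetical additions for demonstration.

```python
import numpy as np


def inplace_func_spelled_out(A, out):
    # Step 1: the expression on the right-hand side allocates a new array.
    tmp = A[1:] + A[:-1]
    # Step 2: slice assignment copies that array element by element into `out`.
    out[:] = tmp
    # Returned only so the temporary can be inspected below.
    return tmp


A = np.random.rand(1000, 1000)
out = np.empty((999, 1000))
tmp = inplace_func_spelled_out(A, out)

# The temporary and `out` are separate buffers holding the same values.
print(np.shares_memory(tmp, out))  # False
print(np.array_equal(tmp, out))    # True
```

Written this way, the extra pass over the data in step 2 is the cost the answer attributes to the in-place version.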
I agree with Oliver's explanation. If you want to perform the operation in place, you have to loop over your array manually. In pure Python this will be much slower, but if you need speed you can resort to Cython, which gives you the speed of a pure C implementation.