Shearing a numpy array
I'd like to 'shear' a numpy array. I'm not sure I'm using the term 'shear' correctly; by shear, I mean something like:
Shift the first column by 0 places
Shift the second column by 1 place
Shift the third column by 2 places
etc...
So this array:
array([[11, 12, 13],
       [17, 18, 19],
       [35, 36, 37]])
would turn into either this array:
array([[11, 36, 19],
       [17, 12, 37],
       [35, 18, 13]])
or something like this array:
array([[11,  0,  0],
       [17, 12,  0],
       [35, 18, 13]])
depending on how we handle the edges. I'm not too particular about edge behavior.
Here's my attempt at a function that does this:
    import numpy

    def shear(a, strength=1, shift_axis=0, increase_axis=1, edges='clip'):
        strength = int(strength)
        shift_axis = int(shift_axis)
        increase_axis = int(increase_axis)
        if shift_axis == increase_axis:
            raise UserWarning("Shear can't shift in the direction it increases")
        # Build one full-size integer coordinate array per axis.
        temp = numpy.zeros(a.shape, dtype=int)
        indices = []
        for d, num in enumerate(a.shape):
            coords = numpy.arange(num)
            shape = [1] * len(a.shape)
            shape[d] = num
            coords = coords.reshape(shape) + temp
            indices.append(coords)
        # Each output element reads from a source position shifted back along shift_axis.
        indices[shift_axis] -= strength * indices[increase_axis]
        if edges == 'clip':
            # Mark out-of-range source positions with -1, then zero them in the result.
            indices[shift_axis][indices[shift_axis] < 0] = -1
            indices[shift_axis][indices[shift_axis] >= a.shape[shift_axis]] = -1
            res = a[tuple(indices)]  # index with a tuple; a list of arrays is not accepted by current numpy
            res[indices[shift_axis] == -1] = 0
        elif edges == 'roll':
            indices[shift_axis] %= a.shape[shift_axis]
            res = a[tuple(indices)]
        else:
            raise ValueError("edges must be 'clip' or 'roll'")
        return res

    if __name__ == '__main__':
        a = numpy.random.random((3, 4))
        print(a)
        print(shear(a))
It seems to work. Please tell me if it doesn't!
It also seems clunky and inelegant. Am I overlooking a builtin numpy/scipy function that does this? Is there a cleaner/better/more efficient way to do this in numpy? Am I reinventing the wheel?
EDIT:
Bonus points if this works on an N-dimensional array, instead of just the 2D case.
This function will be at the very center of a loop I'll repeat many times in our data processing, so I suspect it's actually worth optimizing.
SECOND EDIT:
I finally did some benchmarking. It looks like numpy.roll is the way to go, despite the loop. Thanks, tom10 and Sven Marnach!
Benchmarking code: (originally run on Windows with time.clock, which I didn't trust on Linux; time.clock was removed in Python 3.8, so this version uses time.perf_counter)

    import time, numpy

    def shear_1(a, strength=1, shift_axis=0, increase_axis=1, edges='roll'):
        strength = int(strength)
        shift_axis = int(shift_axis)
        increase_axis = int(increase_axis)
        if shift_axis == increase_axis:
            raise UserWarning("Shear can't shift in the direction it increases")
        temp = numpy.zeros(a.shape, dtype=int)
        indices = []
        for d, num in enumerate(a.shape):
            coords = numpy.arange(num)
            shape = [1] * len(a.shape)
            shape[d] = num
            coords = coords.reshape(shape) + temp
            indices.append(coords)
        indices[shift_axis] -= strength * indices[increase_axis]
        if edges == 'clip':
            indices[shift_axis][indices[shift_axis] < 0] = -1
            indices[shift_axis][indices[shift_axis] >= a.shape[shift_axis]] = -1
            res = a[tuple(indices)]
            res[indices[shift_axis] == -1] = 0
        elif edges == 'roll':
            indices[shift_axis] %= a.shape[shift_axis]
            res = a[tuple(indices)]
        return res

    def shear_2(a, strength=1, shift_axis=0, increase_axis=1, edges='roll'):
        indices = numpy.indices(a.shape)
        indices[shift_axis] -= strength * indices[increase_axis]
        if edges == 'clip':
            # Record out-of-range source positions before the modulo wraps them.
            outside = ((indices[shift_axis] < 0) |
                       (indices[shift_axis] >= a.shape[shift_axis]))
        indices[shift_axis] %= a.shape[shift_axis]
        res = a[tuple(indices)]
        if edges == 'clip':
            res[outside] = 0
        return res

    def shear_3(a, strength=1, shift_axis=0, increase_axis=1):
        # Indexing out increase_axis removes one axis, so a later shift_axis moves down by one.
        if shift_axis > increase_axis:
            shift_axis -= 1
        res = numpy.empty_like(a)
        index = numpy.index_exp[:] * increase_axis
        roll = numpy.roll
        for i in range(0, a.shape[increase_axis]):
            index_i = index + (i,)
            res[index_i] = roll(a[index_i], i * strength, shift_axis)
        return res

    numpy.random.seed(0)
    for a in (
        numpy.random.random((3, 3, 3, 3)),
        numpy.random.random((50, 50, 50, 50)),
        numpy.random.random((300, 300, 10, 10)),
        ):
        print('Array dimensions:', a.shape)
        for sa, ia in ((0, 1), (1, 0), (2, 3), (0, 3)):
            print('Shift axis:', sa)
            print('Increase axis:', ia)
            ref = shear_1(a, shift_axis=sa, increase_axis=ia)
            for shear, label in ((shear_1, '1'), (shear_2, '2'), (shear_3, '3')):
                start = time.perf_counter()
                b = shear(a, shift_axis=sa, increase_axis=ia)
                end = time.perf_counter()
                print(label + ': %0.6f seconds' % (end - start))
                # Compare absolute differences so negative deviations are caught too.
                if numpy.abs(b - ref).max() > 1e-9:
                    print("Something's wrong.")
            print()
Answers (5)
numpy.roll does this. For example, if your original array is x, then rolling column i by i places produces the first (wrap-around) result shown in the question.
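The code from this answer did not survive on this page, so what follows is only a minimal sketch of the idea, assuming x is the 3x3 array from the question. The name `sheared` is introduced here for illustration.

```python
import numpy as np

x = np.array([[11, 12, 13],
              [17, 18, 19],
              [35, 36, 37]])

sheared = x.copy()
for i in range(x.shape[1]):
    # Roll column i down by i places; values pushed off the bottom wrap to the top.
    sheared[:, i] = np.roll(x[:, i], i)

print(sheared)
```

The loop runs once per column, so the Python overhead grows with the number of columns rather than the number of elements.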
The approach in tom10's answer can be extended to arbitrary dimensions:
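The answer's code is missing here. One way the roll-per-slice idea might be generalized is sketched below. The function name `shear_nd` and its parameters are assumptions that mirror the question's signature, and only the wrap-around ("roll") edge behavior is covered.

```python
import numpy as np

def shear_nd(a, strength=1, shift_axis=0, increase_axis=1):
    """Roll slice i along increase_axis by i * strength places along shift_axis."""
    if shift_axis == increase_axis:
        raise ValueError("shift_axis and increase_axis must differ")
    res = np.empty_like(a)
    for i in range(a.shape[increase_axis]):
        # Take slice i with a length-1 slice so the array keeps all its axes
        # and shift_axis still refers to the same axis.
        sl = [slice(None)] * a.ndim
        sl[increase_axis] = slice(i, i + 1)
        sl = tuple(sl)
        res[sl] = np.roll(a[sl], i * strength, axis=shift_axis)
    return res
```

Keeping the sliced axis at length 1 avoids renumbering shift_axis, which shear_3 in the question's benchmark handles by decrementing it instead.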
This can be done using a trick described in this answer by Joe Kington:
To get "clip" instead of "roll", use
This is probably the most efficient way of doing it, since it does not use any Python loop at all.
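Neither the linked trick nor the answer's code is preserved on this page. One loop-free technique that fits the description is a strided view built with numpy.lib.stride_tricks.as_strided. The sketch below is an assumption about what was meant, not the answer's actual code. It handles only the 2D case with shift_axis=0, increase_axis=1 and strength 1, and `shear_strided` is a hypothetical name.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

def shear_strided(a, edges="roll"):
    """Shift column j of a 2D array down by j places without a Python loop."""
    rows, cols = a.shape
    if cols - 1 > rows:
        raise ValueError("this sketch needs cols - 1 <= rows")
    # Stack padding on top of the data: a copy of `a` for "roll", zeros for "clip".
    pad = a if edges == "roll" else np.zeros_like(a)
    b = np.vstack((pad, a))  # new C-contiguous array of shape (2 * rows, cols)
    s0, s1 = b.strides
    # With a column stride of s1 - s0, stepping one column right also steps one
    # row up, so view[k, j] reads b[rows + k - j, j].
    view = as_strided(b[rows:], shape=a.shape, strides=(s0, s1 - s0))
    return view.copy()  # copy so the result is an ordinary, safely writable array

x = np.array([[11, 12, 13],
              [17, 18, 19],
              [35, 36, 37]])
print(shear_strided(x, "roll"))
print(shear_strided(x, "clip"))
```

as_strided does no bounds checking, which is why the sketch rejects shapes where the view would reach above the padding.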
Here is a cleaned-up version of your own approach:
The main difference is that it uses numpy.indices() instead of rolling your own version of this.
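The cleaned-up code itself is not preserved here. As a rough sketch of what an index-based shear built on numpy.indices() can look like (the name `shear_indices` is an assumption, and the out-of-range mask is taken before the modulo so that "clip" actually zeroes the wrapped entries):

```python
import numpy as np

def shear_indices(a, strength=1, shift_axis=0, increase_axis=1, edges="roll"):
    # idx[d] holds, for every position in `a`, its coordinate along axis d.
    idx = np.indices(a.shape)
    # Each output element reads from a source shifted back along shift_axis.
    idx[shift_axis] -= strength * idx[increase_axis]
    n = a.shape[shift_axis]
    outside = (idx[shift_axis] < 0) | (idx[shift_axis] >= n)
    idx[shift_axis] %= n
    res = a[tuple(idx)]  # fancy indexing returns a new array
    if edges == "clip":
        res[outside] = 0
    return res

x = np.array([[11, 12, 13],
              [17, 18, 19],
              [35, 36, 37]])
print(shear_indices(x, edges="clip"))
```

This version allocates one full-size index array per dimension, which may explain why it was not the fastest option in the question's benchmark.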
You should probably consider this pseudocode more than actual Python, I think. Basically: transpose the array, map a general rotate function over it to do the rotation, then transpose it back.
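The pseudocode itself is missing on this page. A runnable reading of the description might look like the following, where `rotate` and `shear_by_transpose` are hypothetical names and only the 2D wrap-around case is handled.

```python
import numpy as np

def rotate(row, k):
    """Rotate a 1D array right by k places, wrapping around."""
    k %= len(row)
    if k == 0:
        return row.copy()
    return np.concatenate((row[-k:], row[:-k]))

def shear_by_transpose(a):
    # After transposing, each row of a.T is a column of `a`.
    rotated = [rotate(col, i) for i, col in enumerate(a.T)]
    # Stack the rotated columns as rows, then transpose back.
    return np.array(rotated).T

x = np.array([[11, 12, 13],
              [17, 18, 19],
              [35, 36, 37]])
print(shear_by_transpose(x))
```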