Concatenate Numpy arrays without copying
In Numpy, I can concatenate two arrays end-to-end with np.append or np.concatenate:
>>> X = np.array([[1,2,3]])
>>> Y = np.array([[-1,-2,-3],[4,5,6]])
>>> Z = np.append(X, Y, axis=0)
>>> Z
array([[ 1,  2,  3],
       [-1, -2, -3],
       [ 4,  5,  6]])
But these make copies of their input arrays:
>>> Z[0,:] = 0
>>> Z
array([[ 0,  0,  0],
       [-1, -2, -3],
       [ 4,  5,  6]])
>>> X
array([[1, 2, 3]])
Is there a way to concatenate two arrays into a view, i.e. without copying? Would that require an np.ndarray subclass?
Comments (6)
The memory belonging to a Numpy array must lie in a single block, addressed by one base pointer plus fixed strides. If you allocated the arrays separately, they sit at unrelated addresses in memory, and there is no way to represent them together as a single Numpy view.
If you know beforehand how many arrays you need, you can instead start with one big array that you allocate beforehand, and have each of the small arrays be a view to the big array (e.g. obtained by slicing).
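A minimal sketch of that approach, reusing the shapes from the question. The names X, Y and Z mirror the question; the integer dtype and the 3x3 shape are assumptions made for illustration.

```python
import numpy as np

# Allocate the combined array once, up front (shape and dtype are assumed).
Z = np.empty((3, 3), dtype=np.int64)

# Basic slices are views: X and Y share memory with Z, nothing is copied.
X = Z[:1]
Y = Z[1:]

# Fill the small arrays in place; the data lands directly in Z.
X[...] = [[1, 2, 3]]
Y[...] = [[-1, -2, -3], [4, 5, 6]]

# Writing through Z is now visible through X.
Z[0, :] = 0
print(X)                       # [[0 0 0]]
print(np.shares_memory(X, Z))  # True
```

Note that X and Y must be filled in place (for example with `X[...] = ...`); rebinding the name with `X = ...` would detach it from Z.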
Just initialize the array before you fill it with data. If you want, you can allocate more space than needed: with np.empty, most operating systems typically do not back the untouched part with physical RAM.
The memory is used only once data is put into the array. Creating a new array by concatenating two copies all of the data, which becomes impractically slow on a sizable dataset, i.e. > 1GB or so.
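One way this could look in practice is an over-allocated buffer plus a fill counter. This is only a sketch: `capacity`, `buf`, `n` and the chunk values are hypothetical names and data chosen for illustration, not part of the answer.

```python
import numpy as np

capacity = 1000  # assumed upper bound on the number of rows
buf = np.empty((capacity, 3), dtype=np.int64)  # allocated once, contents uninitialized
n = 0  # number of rows filled so far

# Hypothetical incoming chunks of rows.
chunks = [
    np.array([[1, 2, 3]]),
    np.array([[-1, -2, -3], [4, 5, 6]]),
]

for chunk in chunks:
    k = chunk.shape[0]
    if n + k > capacity:
        raise ValueError("buffer is full")
    buf[n:n + k] = chunk  # copy each chunk into place exactly once
    n += k

# The filled part is a view on the buffer, not a copy.
Z = buf[:n]
print(Z.shape)  # (3, 3)
```

Each chunk is copied once into the buffer, instead of the whole dataset being recopied on every concatenation.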
I had the same problem and ended up doing it in reverse: after concatenating normally (with a copy), I reassigned the original arrays to become views on the concatenated one:
You can test it as follows:
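A sketch of what this might look like with the arrays from the question; it is an illustration of the idea, not necessarily the code the answer originally showed. The helper name `n` is an assumption.

```python
import numpy as np

X = np.array([[1, 2, 3]])
Y = np.array([[-1, -2, -3], [4, 5, 6]])

# Concatenate normally; this copies the data once.
Z = np.concatenate((X, Y), axis=0)

# Rebind the original names to views on the concatenated array.
n = X.shape[0]
X = Z[:n]
Y = Z[n:]

# Test: a write through Z is now visible through X.
Z[0, :] = 0
print(X)  # [[0 0 0]]
```

Any other reference to the old X or Y still points at the old, separate data, so this only helps when the names can be rebound everywhere they are used.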
Not really elegant at all, but you can get close to what you want by using a tuple to store references to the arrays. I have no idea how I would use it in this case, but I have done things like this before.
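A rough sketch of the tuple idea, assuming the arrays from the question; `parts` and `rows` are hypothetical names. The tuple holds references only, so it is not an array and cannot be indexed like one.

```python
import numpy as np

X = np.array([[1, 2, 3]])
Y = np.array([[-1, -2, -3], [4, 5, 6]])

# The tuple stores references to X and Y; no array data is copied.
parts = (X, Y)

# Writing through the tuple modifies the original array.
parts[0][0, :] = 0
print(X)  # [[0 0 0]]

# Rows can be visited in "concatenated" order; each row is a view.
rows = [row for part in parts for row in part]
print(len(rows))  # 3
```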
This answer is based on my other answer to "Reference to ndarray rows in ndarray".
You may create an array of arrays, like:
The problem is that it creates copies on broadcast operations (sounds like a bug).
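A sketch of one way to build such an array of arrays, assuming an object-dtype container; this illustrates the idea and is not necessarily the code the answer originally showed. The names Z and W are assumptions.

```python
import numpy as np

X = np.array([[1, 2, 3]])
Y = np.array([[-1, -2, -3], [4, 5, 6]])

# An object array whose elements are references to the original arrays.
Z = np.empty(2, dtype=object)
Z[0] = X
Z[1] = Y

# Element access returns the original array, so writes reach X.
Z[0][0, :] = 0
print(X)  # [[0 0 0]]

# A broadcast operation builds new arrays instead of touching X and Y.
W = Z + 1
print(W[0])  # [[1 1 1]]
print(X)     # [[0 0 0]]
```

Z here has shape (2,), not (3, 3), so it does not behave like the concatenated array for slicing or arithmetic.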