连接 Numpy 数组而不进行复制

发布于 2024-12-11 06:13:49 字数 537 浏览 0 评论 0原文

在 Numpy 中，我可以使用 np.append 或 np.concatenate 端到端连接两个数组：

>>> X = np.array([[1,2,3]])
>>> Y = np.array([[-1,-2,-3],[4,5,6]])
>>> Z = np.append(X, Y, axis=0)
>>> Z
array([[ 1,  2,  3],
       [-1, -2, -3],
       [ 4,  5,  6]])

但是这些会复制其输入数组：

>>> Z[0,:] = 0
>>> Z
array([[ 0,  0,  0],
       [-1, -2, -3],
       [ 4,  5,  6]])
>>> X
array([[1, 2, 3]])

有没有办法将两个数组连接到一个视图中，即不复制？这需要 np.ndarray 子类吗？

原文

In Numpy, I can concatenate two arrays end-to-end with np.append or np.concatenate:

>>> X = np.array([[1,2,3]])
>>> Y = np.array([[-1,-2,-3],[4,5,6]])
>>> Z = np.append(X, Y, axis=0)
>>> Z
array([[ 1,  2,  3],
       [-1, -2, -3],
       [ 4,  5,  6]])

But these make copies of their input arrays:

>>> Z[0,:] = 0
>>> Z
array([[ 0,  0,  0],
       [-1, -2, -3],
       [ 4,  5,  6]])
>>> X
array([[1, 2, 3]])

Is there a way to concatenate two arrays into a view, i.e. without copying? Would that require an np.ndarray subclass?

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

一抹淡然 2024-12-18 06:13:49

属于 Numpy 数组的内存必须是连续的。如果单独分配数组，它们会随机分散在内存中，并且无法将它们表示为视图 Numpy 数组。

如果您事先知道需要多少个数组，则可以从预先分配的一个大数组开始，并让每个小数组成为大数组的视图（例如通过切片获得）。

回复收藏 0 原文

逆流 2024-12-18 06:13:49

只需在用数据填充数组之前初始化数组即可。如果你愿意，你可以分配比需要更多的空间，并且由于 numpy 的工作方式，它不会占用更多的 RAM。

A = np.zeros(R,C)
A[row] = [data]

仅当数据放入数组后才会使用内存。在任何大小的数据集上，通过连接两个数组创建一个新数组永远不会完成，即数据集> 1GB左右。

Just initialize the array before you fill it with data. If you want you can allocate more space than needed and it will not take up more RAM because of the way numpy works.

A = np.zeros(R,C)
A[row] = [data]

The memory is used only once data is put into the array. Creating a new array from concatenating two will never finish on a dataset of any size, i.e. dataset > 1GB or so.

回复收藏 0 原文

赤濁 2024-12-18 06:13:49

我遇到了同样的问题，最终把它颠倒过来，在正常连接（带副本）后，我重新分配原始数组以成为连接数组的视图：

import numpy as np

def concat_no_copy(arrays):
    """ Concats the arrays and returns the concatenated array 
    in addition to the original arrays as views of the concatenated one.

    Parameters:
    -----------
    arrays: list
        the list of arrays to concatenate
    """
    con = np.concatenate(arrays)

    viewarrays = []
    for i, arr in enumerate(arrays):
        arrnew = con[sum(len(a) for a in arrays[:i]):
                     sum(len(a) for a in arrays[:i + 1])]
        viewarrays.append(arrnew)
        assert all(arr == arrnew)

    # return the view arrays, replace the old ones with these
    return con, viewarrays

您可以按如下方式测试它：

def test_concat_no_copy():
    arr1 = np.array([0, 1, 2, 3, 4])
    arr2 = np.array([5, 6, 7, 8, 9])
    arr3 = np.array([10, 11, 12, 13, 14])

    arraylist = [arr1, arr2, arr3]

    con, newarraylist = concat_no_copy(arraylist)

    assert all(con == np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 
                                11, 12, 13, 14]))

    for old, new in zip(arraylist, newarraylist):
        assert all(old == new)

I had the same problem and ended up doing it reversed, after concatenating normally (with copy) I reassigned the original arrays to become views on the concatenated one:

import numpy as np

def concat_no_copy(arrays):
    """ Concats the arrays and returns the concatenated array 
    in addition to the original arrays as views of the concatenated one.

    Parameters:
    -----------
    arrays: list
        the list of arrays to concatenate
    """
    con = np.concatenate(arrays)

    viewarrays = []
    for i, arr in enumerate(arrays):
        arrnew = con[sum(len(a) for a in arrays[:i]):
                     sum(len(a) for a in arrays[:i + 1])]
        viewarrays.append(arrnew)
        assert all(arr == arrnew)

    # return the view arrays, replace the old ones with these
    return con, viewarrays

You can test it as follows:

def test_concat_no_copy():
    arr1 = np.array([0, 1, 2, 3, 4])
    arr2 = np.array([5, 6, 7, 8, 9])
    arr3 = np.array([10, 11, 12, 13, 14])

    arraylist = [arr1, arr2, arr3]

    con, newarraylist = concat_no_copy(arraylist)

    assert all(con == np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 
                                11, 12, 13, 14]))

    for old, new in zip(arraylist, newarraylist):
        assert all(old == new)

回复收藏 0 原文

荒芜了季节 2024-12-18 06:13:49

一点也不优雅，但是您可以使用元组存储指向数组的指针来接近您想要的效果。现在我不知道如何在这种情况下使用它，但我以前做过类似的事情。

>>> X = np.array([[1,2,3]])
>>> Y = np.array([[-1,-2,-3],[4,5,6]])
>>> z = (X, Y)
>>> z[0][:] = 0
>>> z
(array([[0, 0, 0]]), array([[-1, -2, -3],
       [ 4,  5,  6]]))
>>> X
array([[0, 0, 0]])

Not really elegant at all but you can get close to what you want using a tuple to store pointers to the arrays. Now I have no idea how I would use it in the case but I have done things like this before.

>>> X = np.array([[1,2,3]])
>>> Y = np.array([[-1,-2,-3],[4,5,6]])
>>> z = (X, Y)
>>> z[0][:] = 0
>>> z
(array([[0, 0, 0]]), array([[-1, -2, -3],
       [ 4,  5,  6]]))
>>> X
array([[0, 0, 0]])

回复收藏 0 原文

森林很绿却致人迷途 2024-12-18 06:13:49

答案基于我的其他答案 Reference to ndarray rows in ndarray< /a>

X = np.array([[1,2,3]])
Y = np.array([[-1,-2,-3],[4,5,6]])
Z = np.array([None, None, None])
Z[0] = X[0]
Z[1] = Y[0]
Z[2] = Y[1]

Z[0][0] = 5 # X would be changed as well

print(X)
Output: 
array([[5, 2, 3]])

# Let's make it a function!
def concat(X, Y, copy=True):
    """Return an array of references if copy=False""" 
    if copy is True:  # deep copy
        return np.append(X, Y, axis=0)
    len_x, len_y = len(X), len(Y)
    ret = np.array([None for _ in range(len_x + len_y)])
    for i in range(len_x):
        ret[i] = X[i]
    for j in range(len_y):
        ret[len_x + j] = Y[j] 
    return ret

The answer is based on my other answer in Reference to ndarray rows in ndarray

X = np.array([[1,2,3]])
Y = np.array([[-1,-2,-3],[4,5,6]])
Z = np.array([None, None, None])
Z[0] = X[0]
Z[1] = Y[0]
Z[2] = Y[1]

Z[0][0] = 5 # X would be changed as well

print(X)
Output: 
array([[5, 2, 3]])

# Let's make it a function!
def concat(X, Y, copy=True):
    """Return an array of references if copy=False""" 
    if copy is True:  # deep copy
        return np.append(X, Y, axis=0)
    len_x, len_y = len(X), len(Y)
    ret = np.array([None for _ in range(len_x + len_y)])
    for i in range(len_x):
        ret[i] = X[i]
    for j in range(len_y):
        ret[len_x + j] = Y[j] 
    return ret

回复收藏 0 原文

孤千羽 2024-12-18 06:13:49

您可以创建一个数组的数组，例如：

>>> from numpy import *
>>> a = array([1.0, 2.0, 3.0])
>>> b = array([4.0, 5.0])
>>> c = array([a, b])
>>> c
array([[ 1.  2.  3.], [ 4.  5.]], dtype=object)
>>> a[0] = 100.0
>>> a
array([ 100.,    2.,    3.])
>>> c
array([[ 100.    2.    3.], [ 4.  5.]], dtype=object)
>>> c[0][1] = 200.0
>>> a
array([ 100.,  200.,    3.])
>>> c
array([[ 100.  200.    3.], [ 4.  5.]], dtype=object)
>>> c *= 1000
>>> c
array([[ 100000.  200000.    3000.], [ 4000.  5000.]], dtype=object)
>>> a
array([ 100.,  200.,    3.])
>>> # Oops! Copies were made...

问题是它在广播操作上创建副本（听起来像一个错误）。

You may create an array of arrays, like:

>>> from numpy import *
>>> a = array([1.0, 2.0, 3.0])
>>> b = array([4.0, 5.0])
>>> c = array([a, b])
>>> c
array([[ 1.  2.  3.], [ 4.  5.]], dtype=object)
>>> a[0] = 100.0
>>> a
array([ 100.,    2.,    3.])
>>> c
array([[ 100.    2.    3.], [ 4.  5.]], dtype=object)
>>> c[0][1] = 200.0
>>> a
array([ 100.,  200.,    3.])
>>> c
array([[ 100.  200.    3.], [ 4.  5.]], dtype=object)
>>> c *= 1000
>>> c
array([[ 100000.  200000.    3000.], [ 4000.  5000.]], dtype=object)
>>> a
array([ 100.,  200.,    3.])
>>> # Oops! Copies were made...

The problem is that it creates copies on broadcast operations (sounds like a bug).

回复收藏 0 原文

~没有更多了~