Problem subclassing numpy ndarray

Posted 2024-10-19 09:48:58

I would like to subclass numpy ndarray, but I cannot change the array from inside a method. Why doesn't self = ... change the array? Thanks.

import numpy as np

class Data(np.ndarray):

    def __new__(cls, inputarr):
        obj = np.asarray(inputarr).view(cls)
        return obj

    def remove_some(self, t):
        test_cols, test_vals = zip(*t)
        test_cols = self[list(test_cols)]
        test_vals = np.array(test_vals, test_cols.dtype)

        self = self[test_cols != test_vals] # Is this part correct?

        print(len(self))  # correct result

z = np.array([(1,2,3), (4,5,6), (7,8,9)],
    dtype=[('a', int), ('b', int), ('c', int)])
d = Data(z)
d.remove_some([('a',4)])

print(len(d))  # same size as the original. Why?

甜警司 2024-10-26 09:48:58

You are not getting the result you expect because you are reassigning self inside remove_some. That only rebinds the local name self to a new array; the object the caller holds is untouched. If the array's shape did not change, you could simply write self[:] = ... and keep the same object, and all would be well. But you are trying to change the shape of self, which means allocating new memory and changing what the name refers to.

I don't know how to do this. I thought it could be achieved by __array_finalize__ or __array__ or __array_wrap__. But everything I've tried is falling short.

Now, there's another way to go about this that doesn't subclass ndarray. You can make a new class that keeps an ndarray as an attribute and then override all the usual __add__, __mul__, etc. Something like this:

import numpy as np

class Data(object):
    def __init__(self, inarr):
        self._array = np.array(inarr)
    def remove_some(self, x):
        # x is an index or boolean mask selecting the rows to keep
        self._array = self._array[x]
    def __add__(self, other):
        return np.add(self._array, other)

Well, you get the picture. Overriding all the operators is a pain, but in the long run I think it is more flexible.
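A slightly fuller sketch of the wrapper idea follows. Returning a Data from __add__, adding __len__, and forwarding unknown attributes through __getattr__ are my own assumptions about what such a class might want; they are not part of the answer above:

```python
import numpy as np

class Data(object):
    """Wrapper that owns an ndarray instead of subclassing it."""

    def __init__(self, inarr):
        self._array = np.array(inarr)

    def remove_some(self, mask):
        # Rebinding an attribute (unlike rebinding `self`) is visible to callers.
        self._array = self._array[mask]

    def __len__(self):
        return len(self._array)

    def __add__(self, other):
        # Unwrap another Data so np.add only sees plain arrays.
        other = other._array if isinstance(other, Data) else other
        return Data(np.add(self._array, other))

    def __getattr__(self, name):
        # Called only when normal lookup fails; forwards e.g. .shape and .dtype.
        if name == "_array":
            raise AttributeError(name)  # avoid recursion before __init__ has run
        return getattr(self._array, name)

d = Data([1, 2, 3, 4])
d.remove_some(np.array([True, False, True, True]))
total = d + 10
print(len(d), d.shape, total._array.tolist())  # 3 (3,) [11, 13, 14]
```

Forwarding through __getattr__ reduces the amount of boilerplate, but operators such as + are looked up on the type, so they still have to be defined explicitly.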

You'll have to read this thoroughly to do it right. There are methods like __array_finalize__ that need to be called at the right time to do "cleanup".
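For orientation, here is a minimal sketch of what __array_finalize__ is typically used for. NumPy invokes it itself whenever a new instance of the subclass is created, including views and indexing results. The class Tagged and its tag attribute are hypothetical names chosen for this illustration:

```python
import numpy as np

class Tagged(np.ndarray):
    """Hypothetical subclass carrying one extra attribute, `tag`."""

    def __new__(cls, inputarr, tag=None):
        obj = np.asarray(inputarr).view(cls)
        obj.tag = tag
        return obj

    def __array_finalize__(self, obj):
        # obj is None for direct construction, otherwise the array this
        # instance was derived from (a view, a slice, an indexing result).
        if obj is None:
            return
        self.tag = getattr(obj, "tag", None)

a = Tagged([1, 2, 3], tag="raw")
b = a[a != 2]  # a new, shorter array; `a` itself is not resized
print(type(b).__name__, b.tag, b.tolist(), len(a))
```

This hook lets derived arrays inherit extra attributes, but it does not let a method resize the existing object, which is why the approaches in this thread return or store a new array instead.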

小清晰的声音 2024-10-26 09:48:58

Perhaps make this a function, rather than a method:

import numpy as np

def remove_row(arr,col,val):
    return arr[arr[col]!=val]

z = np.array([(1,2,3), (4,5,6), (7,8,9)],
    dtype=[('a', int), ('b', int), ('c', int)])

z=remove_row(z,'a',4)
print(repr(z))

# array([(1, 2, 3), (7, 8, 9)], 
#       dtype=[('a', '<i4'), ('b', '<i4'), ('c', '<i4')])

Or, if you want it as a method,

import numpy as np

class Data(np.ndarray):

    def __new__(cls, inputarr):
        obj = np.asarray(inputarr).view(cls)
        return obj

    def remove_some(self, col, val):
        return self[self[col] != val]

z = np.array([(1,2,3), (4,5,6), (7,8,9)],
    dtype=[('a', int), ('b', int), ('c', int)])
d = Data(z)
d = d.remove_some('a', 4)
print(d)

The key difference here is that remove_some does not try to modify self, it merely returns a new instance of Data.
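Because each call returns a new Data, calls can also be chained. A small sketch reusing the same class; the second filter on column 'b' is only an illustrative assumption:

```python
import numpy as np

class Data(np.ndarray):

    def __new__(cls, inputarr):
        return np.asarray(inputarr).view(cls)

    def remove_some(self, col, val):
        # Indexing a subclass instance returns the same subclass.
        return self[self[col] != val]

z = np.array([(1, 2, 3), (4, 5, 6), (7, 8, 9)],
             dtype=[('a', int), ('b', int), ('c', int)])
d = Data(z)
e = d.remove_some('a', 4).remove_some('b', 8)
print(type(e).__name__, len(e), len(d))  # Data 1 3
```

The original d keeps all three rows; only the returned object is shorter, so the caller decides whether to rebind the name.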

一抹苦笑 2024-10-26 09:48:58

I tried to do the same, but subclassing ndarray is really quite complex.

If you only need to add some functionality, I would suggest creating a class that stores the array as an attribute.

class Data(object):

    def __init__(self, array):
        self.array = array

    def remove_some(self, t):
        # operate on self.array
        pass

d = Data(z)  # z is the structured array from the question
print(d.array)
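One way the placeholder in remove_some might be filled in, as a sketch only. It assumes t is a list of (column, value) pairs as in the question, and that a row is dropped only when every pair matches, which mirrors the question's structured comparison. The np.asarray call in __init__ is my addition:

```python
import numpy as np

class Data(object):

    def __init__(self, array):
        self.array = np.asarray(array)

    def remove_some(self, t):
        # Mark rows where every (column, value) pair in t matches ...
        match = np.ones(len(self.array), dtype=bool)
        for col, val in t:
            match &= self.array[col] == val
        # ... and keep the rest by rebinding the attribute.
        self.array = self.array[~match]

z = np.array([(1, 2, 3), (4, 5, 6), (7, 8, 9)],
             dtype=[('a', int), ('b', int), ('c', int)])
d = Data(z)
d.remove_some([('a', 4)])
print(len(d.array))  # 2
print(len(z))        # 3
```

Here the method really does change what d holds, which is the behaviour the question was after, while the array z passed in is left alone.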
