How to provide additional initialization for a subclass of namedtuple?

Posted on 2024-09-17 08:27:13 | 579 words | 9 views | 0 comments

Suppose I have a namedtuple like this:

from collections import namedtuple

EdgeBase = namedtuple("EdgeBase", "left, right")

I want to implement a custom hash-function for this, so I create the following subclass:

class Edge(EdgeBase):
    def __hash__(self):
        return hash(self.left) * hash(self.right)

Since the object is immutable, I want the hash-value to be calculated only once, so I do this:

class Edge(EdgeBase):
    def __init__(self, left, right):
        self._hash = hash(self.left) * hash(self.right)

    def __hash__(self):
        return self._hash

This appears to work, but I am really not sure about subclassing and initialization in Python, especially with tuples. Are there any pitfalls to this solution? Is there a recommended way to do this? Is it fine? Thanks in advance.

Comments (3)

心的位置 2024-09-24 08:27:15

In Python 3.7+, you can now use dataclasses to build hashable classes with ease.

Code

Assuming left and right are of type int, we use the default hashing via the unsafe_hash keyword (see the note at the end of this answer):

import dataclasses as dc


@dc.dataclass(unsafe_hash=True)
class Edge:
    left: int
    right: int


hash(Edge(1, 2))
# 3713081631934410656

Now we can use these (mutable) hashable objects as elements in a set or as keys in a dict.

{Edge(1, 2), Edge(1, 2), Edge(2, 1), Edge(2, 3)}
# {Edge(left=1, right=2), Edge(left=2, right=1), Edge(left=2, right=3)}

Details

We can alternatively override the __hash__ function:

@dc.dataclass
class Edge:
    left: int
    right: int

    def __post_init__(self):
        # Add custom hashing function here
        self._hash = hash((self.left, self.right))         # emulates default

    def __hash__(self):
        return self._hash


hash(Edge(1, 2))
# 3713081631934410656

Expanding on @ShadowRanger's comment, the OP's custom hash function is not reliable. In particular, the attribute values can be interchanged, e.g. hash(Edge(1, 2)) == hash(Edge(2, 1)), which is likely unintended.
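
To make the collision concrete, here is a minimal sketch comparing the product-based hash from the question with hashing the fields as an ordered tuple. The class names ProductEdge and TupleEdge are hypothetical and exist only for this illustration.

```python
from collections import namedtuple

EdgeBase = namedtuple("EdgeBase", "left, right")


class ProductEdge(EdgeBase):
    """Hypothetical name: uses the product-based hash from the question."""

    def __hash__(self):
        return hash(self.left) * hash(self.right)


class TupleEdge(EdgeBase):
    """Hypothetical name: hashes the fields as an ordered tuple."""

    def __hash__(self):
        return hash((self.left, self.right))


# Multiplication is commutative, so swapping the fields never changes the hash.
swapped_collides = hash(ProductEdge(1, 2)) == hash(ProductEdge(2, 1))

# Any edge with a field whose hash is 0 collapses to the same hash value.
zero_collides = hash(ProductEdge(0, 5)) == hash(ProductEdge(7, 0))

# Tuple hashing takes the position of each field into account.
tuple_distinguishes = hash(TupleEdge(1, 2)) != hash(TupleEdge(2, 1))

# Equality is still tuple equality, so colliding edges remain distinct elements.
distinct_elements = len({ProductEdge(1, 2), ProductEdge(2, 1)})
```

A collision like this is not a correctness bug, because equality still tells the two edges apart, but many collisions degrade set and dict lookups.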

Note: the name "unsafe" signals that the default hash will be used even though the object is mutable. This may be undesirable, particularly within a dict, which expects keys whose hashes do not change. Immutable hashing can be turned on with the appropriate keywords, for example frozen=True together with the default eq=True. See also more on hashing logic in dataclasses and a related issue.
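
As a rough sketch of the immutable variant mentioned in the note, assuming frozen=True is the keyword meant there: the class name FrozenEdge is hypothetical, chosen only to avoid clashing with Edge above.

```python
import dataclasses as dc


@dc.dataclass(frozen=True)
class FrozenEdge:  # hypothetical name, not from the answer
    left: int
    right: int


edge = FrozenEdge(1, 2)

# With eq=True (the default) and frozen=True, dataclass generates __hash__
# from the fields, so equal instances hash equally.
same_hash = hash(edge) == hash(FrozenEdge(1, 2))

# Assigning to a field of a frozen instance raises FrozenInstanceError.
try:
    edge.left = 10
    mutated = True
except dc.FrozenInstanceError:
    mutated = False

unique_edges = {FrozenEdge(1, 2), FrozenEdge(1, 2), FrozenEdge(2, 1)}
```

Because the fields cannot be reassigned, the hash of an instance stays stable for as long as it sits in a set or dict.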

北恋 2024-09-24 08:27:15

The code in the question could benefit from a super call in the __init__ in case it ever gets subclassed in a multiple inheritance situation, but otherwise is correct.

class Edge(EdgeBase):
    def __init__(self, left, right):
        # On Python 3, object.__init__ rejects extra arguments with a
        # TypeError, so left and right are not forwarded here.
        super().__init__()
        self._hash = hash(self.left) * hash(self.right)

    def __hash__(self):
        return self._hash

While tuples are read-only, only the tuple part of a subclass instance is read-only; other attributes can be written as usual, which is what allows the assignment to _hash regardless of whether it is done in __init__ or __new__. You can make the subclass fully read-only by setting its __slots__ to (), which has the added benefit of saving memory, but then you would not be able to assign to _hash.
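
The effect of __slots__ described above can be checked with a small sketch. The class names PlainEdge and SlottedEdge are hypothetical, used only to show the two cases side by side.

```python
from collections import namedtuple

EdgeBase = namedtuple("EdgeBase", "left, right")


class PlainEdge(EdgeBase):  # hypothetical name
    pass  # no __slots__, so instances get a per-instance __dict__


class SlottedEdge(EdgeBase):  # hypothetical name
    __slots__ = ()  # no per-instance __dict__, so no extra attributes


plain = PlainEdge(1, 2)
plain._hash = 42  # allowed: stored in the instance __dict__

slotted = SlottedEdge(1, 2)
try:
    slotted._hash = 42
    assignment_allowed = True
except AttributeError:
    assignment_allowed = False

# The tuple fields themselves are read-only in both cases.
try:
    plain.left = 10
    field_writable = True
except AttributeError:
    field_writable = False
```

This is the trade-off in practice: with __slots__ = () the instance is fully read-only and smaller, but there is nowhere to keep a cached hash.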

猫卆 2024-09-24 08:27:14

Edit for 2017: it turns out namedtuple isn't a great idea. attrs is the modern alternative.

class Edge(EdgeBase):
    def __new__(cls, left, right):
        self = super(Edge, cls).__new__(cls, left, right)
        self._hash = hash(self.left) * hash(self.right)
        return self

    def __hash__(self):
        return self._hash

__new__ is what you want to call here because tuples are immutable. Immutable objects are created in __new__ and then returned to the user, instead of being populated with data in __init__.

cls has to be passed twice to the super call on __new__ because __new__ is, for historical/odd reasons, implicitly a staticmethod.
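
For reference, a sketch of the same approach in Python 3 spelling, assuming the zero-argument form of super() is acceptable. The hash formula is kept as in the question, and the last line checks the staticmethod remark directly.

```python
from collections import namedtuple

EdgeBase = namedtuple("EdgeBase", "left, right")


class Edge(EdgeBase):
    def __new__(cls, left, right):
        # Zero-argument super() replaces super(Edge, cls); cls must still be
        # passed explicitly because __new__ is a static method.
        self = super().__new__(cls, left, right)
        # Computed once at construction time and cached on the instance.
        self._hash = hash(self.left) * hash(self.right)
        return self

    def __hash__(self):
        return self._hash


edge = Edge(1, 2)
cached = edge._hash
hashed = hash(edge)

# The class machinery wraps a plain __new__ function in staticmethod.
new_is_static = isinstance(Edge.__dict__["__new__"], staticmethod)
```

Since the subclass does not define __slots__, the cached _hash lives in the instance __dict__ next to the read-only tuple fields.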
