Python 中不可散列的查找表

发布于 2024-10-08 03:10:09 字数 399 浏览 11 评论 0原文

我需要创建从我自己的自定义类(从 dict 派生)的对象到另一个自定义类的对象的映射。据我所知,有两种方法可以做到这一点:

  1. 我可以使对象可散列。我不知道我会怎么做。我知道我可以实现 __hash__() 但我不确定如何实际计算哈希(应该是整数)。

  2. 由于可以比较我的对象,我可以创建一个列表 [(myobj, myotherobj)],然后实现一个查找,查找元组中的第一项与查找键相同的元组。实现这个很简单(对象的数量很少),但如果标准库中已经存在这样的东西,我想避免重新发明轮子。

在我看来,想要查找不可散列的内容是一个常见问题,所以我认为有人已经解决了这个问题。关于如何为类似字典的对象实现 __hash()__ 有什么建议,或者是否有其他标准方法来制作不可散列的查找表?

I need to create a mapping from objects of my own custom class (derived from dict) to objects of another custom class. As I see it there are two ways of doing this:

  1. I can make the objects hashable. I'm not sure how I would do this. I know I can implement __hash__() but I'm unsure how to actually calculate the hash (which should be an integer).

  2. Since my objects can be compared I can make a list [(myobj, myotherobj)] and then implement a lookup which finds the tuple where the first item in the tuple is the same as the lookup key. Implementing this is trivial (the number of objects is small) but I want to avoid reinventing the wheel if something like this already exists in the standard library.

It seems to me that wanting to look up unhashables would be a common problem so I assume someone has already solved this problem. Any suggestions on how to implement __hash()__ for a dict-like object or if there is some other standard way of making lookup tables of unhashables?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(4

鲸落 2024-10-15 03:10:09

使用可变对象作为键的映射通常很困难。这真的是你想要的吗?如果你认为你的对象是不可变的(在Python中没有办法真正强制不可变性),或者你知道它们在映射中用作键时不会改变,你可以为它们实现你自己的哈希函数有几种方法。例如,如果您的对象仅具有可散列数据成员,则可以返回所有数据成员的元组的散列作为对象散列。

如果您的对象是类似字典的,您可以使用 frozenset

def __hash__(self):
    return hash(frozenset(self.iteritems()))

仅当所有值都是可散列的时,这才有效。为了节省哈希值的重新计算(这将在每次查找时完成),您可以缓存哈希值,并在设置了某些脏标志时重新计算它。

Mappings with mutable objects as keys are generally difficult. Is that really what you want? If you consider your objects to be immutable (there is no way to really enforce immutability in Python), or you know they will not be changed while they are used as keys in a mapping, you can implement your own hash-function for them in several ways. For instance, if your object only has hashable data-members, you can return the hash of a tuple of all data-members as the objects hash.

If your object is a dict-like, you can use the hash of a frozenset of all key-value-pairs.

def __hash__(self):
    return hash(frozenset(self.iteritems()))

This only works if all values are hashable. In order to save recalculations of the hashes (which would be done on every lookup), you can cache the hash-value and just recalculate it if some dirty-flag is set.

蘸点软妹酱 2024-10-15 03:10:09

一个简单的解决方案似乎是使用 lookup[id(myobj)] = myotherobj 而不是 lookup[myobj] = myotherobj。对这种做法有什么评论吗?

A simple solution seems to be to do lookup[id(myobj)] = myotherobj instead of lookup[myobj] = myotherobj. Any commente on this approach?

挖个坑埋了你 2024-10-15 03:10:09

如果您不在自定义类中存储任何其他不可散列的对象,则以下内容应该有效:

def __hash__(self):
    return hash(self.items())

The following should work if you're not storing any additional unhashable objects in your custom class:

def __hash__(self):
    return hash(self.items())
可爱咩 2024-10-15 03:10:09

以下是 frozendict 的实现,取自 http://code。 activestate.com/recipes/414283/

class frozendict(dict):
    def _blocked_attribute(obj):
        raise AttributeError, "A frozendict cannot be modified."
    _blocked_attribute = property(_blocked_attribute)

    __delitem__ = __setitem__ = clear = _blocked_attribute
    pop = popitem = setdefault = update = _blocked_attribute

    def __new__(cls, *args):
        new = dict.__new__(cls)
        dict.__init__(new, *args)
        return new

    def __init__(self, *args):
        pass

    def __hash__(self):
        try:
            return self._cached_hash
        except AttributeError:
            h = self._cached_hash = hash(tuple(sorted(self.items())))
            return h

    def __repr__(self):
        return "frozendict(%s)" % dict.__repr__(self)

我将用 frozenset(self.iteritems()) 替换 tuple(sorted(self.items()))Spacecowboy 的回答。并考虑将 __slots__ = ("_cached_hash",) 添加到类中。

Here is an implementation of a frozendict, taken from http://code.activestate.com/recipes/414283/:

class frozendict(dict):
    def _blocked_attribute(obj):
        raise AttributeError, "A frozendict cannot be modified."
    _blocked_attribute = property(_blocked_attribute)

    __delitem__ = __setitem__ = clear = _blocked_attribute
    pop = popitem = setdefault = update = _blocked_attribute

    def __new__(cls, *args):
        new = dict.__new__(cls)
        dict.__init__(new, *args)
        return new

    def __init__(self, *args):
        pass

    def __hash__(self):
        try:
            return self._cached_hash
        except AttributeError:
            h = self._cached_hash = hash(tuple(sorted(self.items())))
            return h

    def __repr__(self):
        return "frozendict(%s)" % dict.__repr__(self)

I would replace tuple(sorted(self.items())) by frozenset(self.iteritems()) as in Spacecowboy's answer. And consider adding __slots__ = ("_cached_hash",) to the class.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文