Python 中不可散列的查找表
我需要创建从我自己的自定义类(从 dict 派生)的对象到另一个自定义类的对象的映射。据我所知,有两种方法可以做到这一点:
我可以使对象可散列。我不知道我会怎么做。我知道我可以实现 __hash__() 但我不确定如何实际计算哈希(应该是整数)。
由于可以比较我的对象,我可以创建一个列表 [(myobj, myotherobj)],然后实现一个查找,查找元组中的第一项与查找键相同的元组。实现这个很简单(对象的数量很少),但如果标准库中已经存在这样的东西,我想避免重新发明轮子。
在我看来,想要查找不可散列的内容是一个常见问题,所以我认为有人已经解决了这个问题。关于如何为类似字典的对象实现 __hash()__ 有什么建议,或者是否有其他标准方法来制作不可散列的查找表?
I need to create a mapping from objects of my own custom class (derived from dict) to objects of another custom class. As I see it there are two ways of doing this:
I can make the objects hashable. I'm not sure how I would do this. I know I can implement
__hash__()
but I'm unsure how to actually calculate the hash (which should be an integer).Since my objects can be compared I can make a list [(myobj, myotherobj)] and then implement a lookup which finds the tuple where the first item in the tuple is the same as the lookup key. Implementing this is trivial (the number of objects is small) but I want to avoid reinventing the wheel if something like this already exists in the standard library.
It seems to me that wanting to look up unhashables would be a common problem so I assume someone has already solved this problem. Any suggestions on how to implement __hash()__
for a dict-like object or if there is some other standard way of making lookup tables of unhashables?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
使用可变对象作为键的映射通常很困难。这真的是你想要的吗?如果你认为你的对象是不可变的(在Python中没有办法真正强制不可变性),或者你知道它们在映射中用作键时不会改变,你可以为它们实现你自己的哈希函数有几种方法。例如,如果您的对象仅具有可散列数据成员,则可以返回所有数据成员的元组的散列作为对象散列。
如果您的对象是类似字典的,您可以使用 frozenset。
仅当所有值都是可散列的时,这才有效。为了节省哈希值的重新计算(这将在每次查找时完成),您可以缓存哈希值,并在设置了某些脏标志时重新计算它。
Mappings with mutable objects as keys are generally difficult. Is that really what you want? If you consider your objects to be immutable (there is no way to really enforce immutability in Python), or you know they will not be changed while they are used as keys in a mapping, you can implement your own hash-function for them in several ways. For instance, if your object only has hashable data-members, you can return the hash of a tuple of all data-members as the objects hash.
If your object is a dict-like, you can use the hash of a frozenset of all key-value-pairs.
This only works if all values are hashable. In order to save recalculations of the hashes (which would be done on every lookup), you can cache the hash-value and just recalculate it if some dirty-flag is set.
一个简单的解决方案似乎是使用
lookup[id(myobj)] = myotherobj
而不是lookup[myobj] = myotherobj
。对这种做法有什么评论吗?A simple solution seems to be to do
lookup[id(myobj)] = myotherobj
instead oflookup[myobj] = myotherobj
. Any commente on this approach?如果您不在自定义类中存储任何其他不可散列的对象,则以下内容应该有效:
The following should work if you're not storing any additional unhashable objects in your custom class:
以下是
frozendict
的实现,取自 http://code。 activestate.com/recipes/414283/:我将用
frozenset(self.iteritems())
替换tuple(sorted(self.items()))
如 Spacecowboy 的回答。并考虑将 __slots__ = ("_cached_hash",) 添加到类中。Here is an implementation of a
frozendict
, taken from http://code.activestate.com/recipes/414283/:I would replace
tuple(sorted(self.items()))
byfrozenset(self.iteritems())
as in Spacecowboy's answer. And consider adding__slots__ = ("_cached_hash",)
to the class.