Set "in" operator: uses equality or identity?
class A(object):
    def __cmp__(self):
        print '__cmp__'
        return object.__cmp__(self)
    def __eq__(self, rhs):
        print '__eq__'
        return True

a1 = A()
a2 = A()
print a1 in set([a1])
print a1 in set([a2])
Why does the first line print True while the second prints False? And why does neither of them call __eq__?
I am using Python 2.6.
Comments (5)
A tangential answer, but your question and my testing made me curious. If you ignore the set operator, which is the source of your __hash__ problem, it turns out your question is still interesting.
Thanks to the help I got on this SO question, I was able to chase the in operator through the source code to its root. Near the bottom I found the PyObject_RichCompareBool function, which indeed tests for identity (see the comment about "Quick result") before testing for equality.
So unless I misunderstand the way things work, the technical answer to your question is first identity and then equality, through the equality test itself. Just to reiterate, that is not the source of the behavior you were seeing but just the technical answer to your question.
If I misunderstood the source, somebody please set me straight.
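The identity-before-equality shortcut can be seen without any set involved. The following is a minimal sketch using a list and a NaN float, which never compares equal to itself; the variable names are only illustrative.

```python
# NaN compares unequal to everything, including itself.
nan = float('nan')

equal_to_itself = (nan == nan)             # the equality test alone fails
found_in_list = (nan in [nan])             # the identity check succeeds first
found_other_nan = (float('nan') in [nan])  # different object and not equal

print(equal_to_itself)
print(found_in_list)
print(found_other_nan)
```

If the in operator relied on equality alone, the second result would have to match the first.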
Sets seem to use hash codes, then identity, before comparing for equality. What happens seems to be that the hash code of the object being checked with the in operator is computed, and elements of the set with the same hash code are then tested first for identity and only after that for equality.
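An experiment along these lines can be sketched as follows. The Probe class and the calls list are hypothetical names introduced here for illustration; every instance is given the same hash and compares equal to anything, so the only thing that varies between the two lookups is object identity.

```python
calls = []  # records which special methods the set lookup invokes

class Probe(object):
    """Hypothetical class: every instance hashes alike and compares equal."""
    def __hash__(self):
        calls.append('__hash__')
        return 1
    def __eq__(self, other):
        calls.append('__eq__')
        return True

p1 = Probe()
p2 = Probe()
set1 = set([p1])
set2 = set([p2])

del calls[:]               # forget the hashing done while building the sets
same_object = p1 in set1   # hash matches and it is the same object
calls_same = list(calls)

del calls[:]
other_object = p1 in set2  # hash matches, different object, so __eq__ runs
calls_other = list(calls)

print(calls_same)
print(calls_other)
```

On CPython the first lookup should record only a __hash__ call, while the second should record __hash__ followed by __eq__.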
Set __contains__ makes checks in the following order: hash equality first, then identity (a is b), then equality (a == b).
The relevant C source code is in Objects/setobject.c::set_lookkey() and in Objects/object.c::PyObject_RichCompareBool().
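As a rough model of that order, here is a pure-Python sketch. The function contains_like_set is a hypothetical helper, not part of any library, and it scans a plain list and recomputes hashes, whereas a real set probes a hash table and reuses stored hash values.

```python
def contains_like_set(candidates, key):
    """Simplified model of the checks a set lookup performs."""
    key_hash = hash(key)
    for item in candidates:
        if hash(item) != key_hash:  # 1. hashes differ: cannot be a match
            continue
        if item is key:             # 2. same object: match without calling __eq__
            return True
        if item == key:             # 3. otherwise fall back to equality
            return True
    return False

int_vs_float = contains_like_set([1.0], 1)      # equal hashes, equal values
missing = contains_like_set(['a', 'b'], 'c')    # no element is equal to 'c'
print(int_vs_float)
print(missing)
```

The model agrees with the built-in behavior for these inputs: 1 in set([1.0]) is also True.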
You need to define __hash__ too; once you do, the membership test will work as expected.
As a general rule, any time you implement __cmp__ you should implement a __hash__ such that for all x and y such that x == y, x.__hash__() == y.__hash__().
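A minimal sketch of such a class, assuming the question's __eq__ that always returns True: because every instance is equal to every other, every instance has to report the same hash. The constant 42 is an arbitrary choice made for this illustration.

```python
class A(object):
    def __eq__(self, rhs):
        return True   # every A compares equal to anything
    def __hash__(self):
        return 42     # arbitrary constant; equal objects must hash alike

a1 = A()
a2 = A()
in_own_set = a1 in set([a1])
in_other_set = a1 in set([a2])
print(in_own_set)
print(in_other_set)
```

With the hashes now consistent with equality, both membership tests succeed.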
Sets and dictionaries gain their speed by using hashing as a fast approximation of full equality checking. If you want to redefine equality, you usually need to redefine the hash algorithm so that it is consistent.
The default hash function uses the identity of the object, which is pretty useless as a fast approximation of full equality, but at least allows you to use an arbitrary class instance as a dictionary key and retrieve the value stored with it if you pass exactly the same object as a key. But it means that if you redefine equality and don't redefine the hash function, your objects will go into a dictionary/set without complaining about not being hashable, but still won't actually work the way you expect them to.
See the official Python docs on __hash__ for more details.
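The difference for dictionary keys can be sketched like this. PlainKey and ValueKey are hypothetical classes made up for the illustration. Note also that the "goes in without complaining" behavior described above applies to Python 2, as used in the question; in Python 3 a class that defines __eq__ without __hash__ becomes unhashable instead.

```python
class PlainKey(object):
    """Keeps the default identity-based hash and equality."""
    def __init__(self, name):
        self.name = name

class ValueKey(object):
    """Hypothetical class whose equality and hash both derive from name."""
    def __init__(self, name):
        self.name = name
    def __eq__(self, other):
        return isinstance(other, ValueKey) and self.name == other.name
    def __ne__(self, other):  # Python 2 does not derive != from ==
        return not self == other
    def __hash__(self):
        return hash(self.name)

plain = PlainKey('k')
plain_table = {plain: 'stored'}
same_object_hit = plain_table.get(plain)         # found: exactly the same object
lookalike_hit = plain_table.get(PlainKey('k'))   # not found: different identity

value_table = {ValueKey('k'): 'stored'}
equal_object_hit = value_table.get(ValueKey('k'))  # found: equal hash and ==

print(same_object_hit)
print(lookalike_hit)
print(equal_object_hit)
```

Deriving both __eq__ and __hash__ from the same attribute is one simple way to keep the two consistent, provided that attribute is not changed while the object is in use as a key.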