Set "in" operator: uses equality or identity?

Posted on 2025-01-01 01:10:20

class A(object):
    def __cmp__(self):
        print '__cmp__'
        return object.__cmp__(self)

    def __eq__(self, rhs):
        print '__eq__'
        return True
a1 = A()
a2 = A()
print a1 in set([a1])
print a1 in set([a2])

Why does the first line print True, but the second print False? And why does neither call __eq__?

I am using Python 2.6.

Comments (5)

后知后觉 2025-01-08 01:10:21

This is a tangential answer, but your question and my testing made me curious. If you set aside the set lookup, which is the source of your __hash__ problem, your question is still interesting.

Thanks to the help I got on this SO question, I was able to chase the in operator through the source code to its root. Near the bottom I found the PyObject_RichCompareBool function, which indeed tests for identity (see the comment about "Quick result") before testing for equality.

So unless I misunderstand the way things work, the technical answer to your question is: first identity, then equality, both inside the equality test itself. To reiterate, that is not the source of the behavior you were seeing, just the technical answer to your question.

If I misunderstood the source, somebody please set me straight.

int
PyObject_RichCompareBool(PyObject *v, PyObject *w, int op)
{
    PyObject *res;
    int ok;

    /* Quick result when objects are the same.
       Guarantees that identity implies equality. */
    if (v == w) {
        if (op == Py_EQ)
            return 1;
        else if (op == Py_NE)
            return 0;
    }

    res = PyObject_RichCompare(v, w, op);
    if (res == NULL)
        return -1;
    if (PyBool_Check(res))
        ok = (res == Py_True);
    else
        ok = PyObject_IsTrue(res);
    Py_DECREF(res);
    return ok;
}
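
The identity shortcut can also be observed from Python without reading the C source. Below is a minimal sketch in Python 3 syntax (the thread itself uses Python 2.6); the class name NeverEqual and the eq_calls counter are made up for illustration. Its __eq__ always returns False, yet list membership still finds the object, which is consistent with the container check comparing by identity first.

```python
class NeverEqual:
    """Hypothetical class whose __eq__ always reports inequality."""

    def __init__(self):
        self.eq_calls = 0  # how many times __eq__ was invoked on this instance

    def __eq__(self, other):
        self.eq_calls += 1
        return False


x = NeverEqual()
y = NeverEqual()

# The == operator goes through __eq__, so x is "not equal" to itself.
direct = (x == x)

# List membership in CPython uses PyObject_RichCompareBool, which
# returns early when both operands are the same object.
same_object_found = x in [x]
calls_after_identity_hit = x.eq_calls

# With a different object there is no shortcut, so __eq__ is consulted.
other_object_found = x in [y]

print(direct, same_object_found, other_object_found)
```

The counter does not move during `x in [x]`, so __eq__ was skipped for that lookup and only ran for the explicit `x == x` and for the lookup against a different object.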

吾性傲以野 2025-01-08 01:10:21

Sets seem to use hash codes, then identity, before comparing for equality. The following code:

class A(object):
    def __eq__(self, rhs):
        print '__eq__'
        return True
    def __hash__(self):
        print '__hash__'
        return 1

a1 = A()
a2 = A()

print 'set1'
set1 = set([a1])

print 'set2'
set2 = set([a2])

print 'a1 in set1'
print a1 in set1

print 'a1 in set2'
print a1 in set2

outputs:

set1
__hash__
set2
__hash__
a1 in set1
__hash__
True
a1 in set2
__hash__
__eq__
True

What happens seems to be:

  1. The hash code is computed when an element is inserted into the set (so it can be compared with the existing elements).
  2. The hash code of the object you're checking with the in operator is computed.
  3. Elements of the set with the same hash code are inspected: first by checking whether they are the same object as the one you're looking for, and only then by checking whether they are logically equal to it.
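
For anyone repeating the experiment on Python 3, here is a rough equivalent that records the calls in a list instead of printing them. It is only a sketch of the same test: the calls list is a helper added for illustration, and the constant hash is kept deliberately so that both instances collide.

```python
calls = []  # records which special methods run, in order


class A:
    def __eq__(self, other):
        calls.append('__eq__')
        return True

    def __hash__(self):
        calls.append('__hash__')
        return 1  # every instance gets the same hash code


a1 = A()
a2 = A()
set1 = {a1}
set2 = {a2}

# Lookup of the very object that is stored in the set.
calls.clear()
found_same = a1 in set1
calls_same = list(calls)

# Lookup of a different object with the same hash code.
calls.clear()
found_other = a1 in set2
calls_other = list(calls)

print(found_same, calls_same)
print(found_other, calls_other)
```

Both lookups hash the candidate once. Only the second one falls through to __eq__, because the stored element is a different object.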

橙味迷妹 2025-01-08 01:10:20

Set __contains__ makes checks in the following order:

 'Match' if hash(a) == hash(b) and (a is b or a==b) else 'No Match'

The relevant C source code is in Objects/setobject.c::set_lookkey() and in Objects/object.c::PyObject_RichCompareBool().
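
To make that expression concrete, here is a small sketch that wraps it in a function and compares it against a real set lookup. The names set_match and AlwaysEqual are made up for illustration, and the code is written for Python 3, where the identity-based hash has to be restored explicitly to reproduce the Python 2 situation from the question.

```python
def set_match(a, b):
    """Pure-Python paraphrase of the per-element test (a sketch, not CPython's code)."""
    return hash(a) == hash(b) and (a is b or a == b)


class AlwaysEqual:
    """Hypothetical class: equal to everything, but hashed by identity."""

    def __eq__(self, other):
        return True

    # Python 3 sets __hash__ to None when only __eq__ is defined, so the
    # identity-based default is restored here to mimic Python 2 behaviour.
    __hash__ = object.__hash__


a1 = AlwaysEqual()
a2 = AlwaysEqual()

for candidate, stored in [(a1, a1), (a1, a2)]:
    # The predicate and the real set lookup should agree.
    print(set_match(candidate, stored), candidate in {stored})
```

For the pair (a1, a2) the hashes differ, so the equality test is never reached even though __eq__ would return True. That is the False from the original question.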

冰雪之触 2025-01-08 01:10:20

You need to define __hash__ too. For example:

class A(object):
    def __hash__(self):
        print '__hash__'
        return 42

    def __cmp__(self, other):
        print '__cmp__'
        return object.__cmp__(self, other)

    def __eq__(self, rhs):
        print '__eq__'
        return True

a1 = A()
a2 = A()
print a1 in set([a1])
print a1 in set([a2])

will work as expected.

As a general rule, any time you implement __cmp__ you should also implement __hash__ so that, for all x and y with x == y, x.__hash__() == y.__hash__().
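
A constant hash like 42 satisfies that rule but puts every instance in the same bucket. A common way to keep __eq__ and __hash__ consistent is to derive both from the same tuple of fields. The following is a sketch in Python 3 syntax; the Point class and its _key helper are invented for illustration and are not part of the question.

```python
class Point:
    """Hypothetical value type whose equality and hash use the same fields."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def _key(self):
        # Single source of truth for both __eq__ and __hash__.
        return (self.x, self.y)

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return self._key() == other._key()

    def __hash__(self):
        return hash(self._key())


p = Point(1, 2)
q = Point(1, 2)  # distinct object, equal value
r = Point(3, 4)

points = {p}
print(q in points, r in points)
```

Because equal points produce equal hashes, the lookup for q reaches the equality test and succeeds even though q is a different object from p.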

手心的温暖 2025-01-08 01:10:20

Sets and dictionaries gain their speed by using hashing as a fast approximation of full equality checking. If you want to redefine equality, you usually need to redefine the hash algorithm so that it is consistent.

The default hash function uses the identity of the object, which is pretty useless as a fast approximation of full equality, but at least allows you to use an arbitrary class instance as a dictionary key and retrieve the value stored with it if you pass exactly the same object as a key. But it means if you redefine equality and don't redefine the hash function, your objects will go into a dictionary/set without complaining about not being hashable, but still won't actually work the way you expect them to.

See the official Python docs on __hash__ for more details.
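
The dictionary case described above can be sketched as follows. This is Python 3 syntax with a made-up Name class. Python 3 sets __hash__ to None when a class defines only __eq__, so the identity-based hash is reinstated by hand here, purely to reproduce the Python 2 behaviour being discussed.

```python
class Name:
    """Hypothetical key type: value-based __eq__, identity-based hash."""

    def __init__(self, text):
        self.text = text

    def __eq__(self, other):
        return isinstance(other, Name) and self.text == other.text

    # Assumption: emulate the Python 2 default on Python 3, which would
    # otherwise set __hash__ to None and make instances unhashable.
    __hash__ = object.__hash__


key = Name('alice')
lookalike = Name('alice')
ages = {key: 30}

print(key == lookalike)     # compared by value
print(ages.get(key))        # lookup with the exact object used as the key
print(ages.get(lookalike))  # lookup with an equal but distinct object
```

The lookalike compares equal to the key but hashes differently, so the dictionary never gets as far as calling __eq__ and reports the key as missing. Without the __hash__ line, Python 3 would instead raise a TypeError when building the dictionary, which makes the mistake much harder to overlook.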
