Is transitivity required in Python's __eq__?
I'm implementing my own class with a custom __eq__, and I'd like to return True for things that are not "equal" in a mathematical sense, but "match" in a fuzzy way.
An issue with this, however, is that it leads to a loss of transitivity in the mathematical sense, i.e. a == b && b == c can both hold while a may not be equal to c.
Question: does Python depend on __eq__ being transitive? Will what I'm trying to do break things, or is it possible to do this as long as I'm careful not to assume transitivity myself?
Use case
I want to match telephone numbers with one another, where each number may be formatted either internationally or just for domestic use (without a country code specified). If no country code is specified, I'd like a number to be equal to a number with one, but if it is specified, it should only be equal to numbers with the same country code, or without one.
So:
- Of course, +31 6 12345678 should equal +31 6 12345678, and 06 12345678 should equal 06 12345678
- +31 6 12345678 should equal 06 12345678 (and v.v.)
- +49 6 12345678 should equal 06 12345678 (and v.v.)
- But +31 6 12345678 should not be equal to +49 6 12345678
I don't have a need for hashing (and so won't implement it), so that at least makes life easier.
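For concreteness, the matching rule described in the use case could be sketched roughly as follows. The class name PhoneNumber, its fields and the parse helper are hypothetical and not part of the question; the parsing is deliberately naive and assumes a space after the "+NN" country code and a single leading 0 as the domestic trunk prefix.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)  # eq=False: the hand-written __eq__ below is used instead
class PhoneNumber:
    """Hypothetical phone number; country is None for a domestic number."""
    country: Optional[str]
    national: str  # digits only, without the trunk prefix

    @classmethod
    def parse(cls, text: str) -> "PhoneNumber":
        parts = text.split()
        if parts[0].startswith("+"):
            # Assumption: the country code is the first space-separated token
            return cls(parts[0][1:], "".join(parts[1:]))
        digits = "".join(parts)
        # Assumption: a domestic number carries one leading trunk "0"
        return cls(None, digits[1:] if digits.startswith("0") else digits)

    def __eq__(self, other):
        if not isinstance(other, PhoneNumber):
            return NotImplemented
        if self.national != other.national:
            return False
        # A missing country code matches any country code
        return (
            self.country is None
            or other.country is None
            or self.country == other.country
        )


nl = PhoneNumber.parse("+31 6 12345678")
de = PhoneNumber.parse("+49 6 12345678")
domestic = PhoneNumber.parse("06 12345678")

print(nl == domestic, domestic == de, nl == de)
```

With this sketch the first two comparisons match while the third does not, which is exactly the non-transitive situation the question is about.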
Comments (2)
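The __eq__ method should be transitive; at least, that is what dictionaries assume.
class A:
    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        # "Equal" here means: other is one of the objects listed in self.values
        for element in self.values:
            if element is other:
                return True
        return False

    def __hash__(self):
        # Constant hash: every instance collides, so the dict must rely on __eq__
        return 0

    def __repr__(self):
        return self.name

x, y, z = A('x'), A('y'), A('z')
x.values = [x, y]
y.values = [x, y, z]
z.values = [y, z]

print(x == y)
# --> True
print(y == z)
# --> True
print(x == z)
# --> False
print({**{x: 1}, **{y: 2, z: 3}})
# --> {x: 3}
print({**{x: 1}, **{z: 3, y: 2}})
# --> {x: 1, z: 2}
{**{x: 1}, **{y: 2, z: 3}} is the union of two dictionaries. Nobody expects a dictionary to drop a key after an update:
print(z in {**{x: 1}, **{y: 2, z: 3}})
# --> False
By changing the order in the union, you can even get dictionaries of different sizes:
print(len({**{x: 1}, **{y: 2, z: 3}}))
# --> 1
print(len({**{x: 1}, **{z: 3, y: 2}}))
# --> 2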
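As a side note, the dictionary behaviour above depends on the class defining __hash__. A class that defines __eq__ without __hash__ has its __hash__ set to None by Python, so its instances cannot be dictionary keys or set members at all. Containers that do not hash, such as lists, still call __eq__ for in, index, count and remove. The small check below illustrates both points; the class name Fuzzy and its wildcard rule are hypothetical.

```python
class Fuzzy:
    """Hypothetical class with a non-transitive __eq__ and no __hash__."""

    def __init__(self, tag):
        self.tag = tag

    def __eq__(self, other):
        if not isinstance(other, Fuzzy):
            return NotImplemented
        # "*" acts as a wildcard that matches any tag
        return "*" in (self.tag, other.tag) or self.tag == other.tag


a, wild, b = Fuzzy("a"), Fuzzy("*"), Fuzzy("b")

# Defining __eq__ without __hash__ makes instances unhashable
try:
    {a: 1}
    hashable = True
except TypeError:
    hashable = False

items = [a, wild, b]
first_match = items.index(b)  # list.index compares with ==, first hit wins
match_count = items.count(b)  # counts every element that == b
print(hashable, first_match, match_count)
```

Here the lookup for b finds the wildcard element first, so results from list methods depend on element order even though no hashing is involved.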
There is no MUST but a SHOULD relation for comparisons being consistent with the commonly understood relations. Python explicitly does not enforce this, and float is a built-in type with different behaviour due to float("nan").
Still, keep in mind that exceptions are incredibly rare and subject to being ignored: most people would treat float as having a total order, for example. Using uncommon comparison relations can seriously increase maintenance effort.
Canonical ways to model "fuzzy matching" via operators are as subset, subsequence or containment, using asymmetric operators.
- set and frozenset support >, >= and so on to indicate that one set encompasses all values of another.
- str and bytes support in to indicate that subsequences are covered.
- range and ipaddress networks support in to indicate that specific items are covered.
Notably, while these operators may be transitive, they are not symmetric. For example, a >= b and c >= b does not imply b >= c and thus not a >= c, or vice versa.
Practically, one could model "number without country code" as the superset of "number with country code" for the same number. This means that 06 12345678 >= +31 6 12345678 and 06 12345678 >= +49 6 12345678, but not vice versa. In order to do a symmetric comparison, one would use a >= b or b >= a instead of a == b.
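A rough sketch of this superset model, assuming a hypothetical PhoneNumber class with national and country attributes (neither name comes from the posts above). Only __ge__ is defined; Python falls back to the reflected __ge__ for <=, and __eq__ is left at its default so that == keeps its usual meaning. The matches helper is likewise an illustrative name for the symmetric check.

```python
class PhoneNumber:
    """Hypothetical number; country is None when no country code was given."""

    def __init__(self, national, country=None):
        self.national = national
        self.country = country

    def __ge__(self, other):
        # self >= other: self covers other, i.e. same national part and
        # self is either unspecific (no country) or names the same country
        if not isinstance(other, PhoneNumber):
            return NotImplemented
        if self.national != other.national:
            return False
        return self.country is None or self.country == other.country


def matches(a, b):
    """Symmetric fuzzy match built from the asymmetric >= relation."""
    return a >= b or b >= a


domestic = PhoneNumber("612345678")
nl = PhoneNumber("612345678", "31")
de = PhoneNumber("612345678", "49")

print(domestic >= nl, nl >= domestic)       # covers, but is not covered
print(matches(nl, domestic), matches(nl, de))
```

In this sketch >= is transitive while matches is not, so the non-transitive relation lives in an explicitly named helper rather than in ==.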