Should mutable collections override equals and hashCode?
I was just wondering whether it is a good idea to override equals and hashCode for mutable collections. Doing so implies that if I insert such a collection into a HashSet and then modify the collection, the HashSet will no longer be able to find it. Does this mean that only immutable collections should override equals and hashCode, or is this a nuisance Java programmers simply live with?
You should override equals and hashCode if your class should act like a value type. This is usually not the case for collections. (I don't really have much Java experience. This answer is based on C#.)
The problem of deep versus shallow equality is bigger than Java; all object-oriented languages have to concern themselves with it.
The objects that you add to a collection should override equals and hashCode, but the default behavior built into the abstract implementations of the collection interfaces suffices for the collection itself.
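To see what that default behavior looks like in practice, here is a minimal sketch using only standard java.util classes. The class name InheritedEqualsDemo and its method names are made up for illustration. It suggests that the stock list and set implementations already compare by content, without any override on your part.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedList;
import java.util.List;
import java.util.Set;

class InheritedEqualsDemo {
    // Lists follow the List.equals contract: same elements in the same order.
    // AbstractList provides this behavior by default, so two different
    // List implementations with the same content compare equal.
    static boolean listsEqual() {
        List<Integer> a = new ArrayList<>(List.of(1, 2, 3));
        List<Integer> b = new LinkedList<>(List.of(1, 2, 3));
        return a.equals(b) && a.hashCode() == b.hashCode();
    }

    // HashSet inherits equals/hashCode from AbstractSet, which ignores order.
    static boolean setsEqual() {
        Set<String> a = new HashSet<>(List.of("x", "y"));
        Set<String> b = new HashSet<>(List.of("y", "x"));
        return a.equals(b) && a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        System.out.println(listsEqual());
        System.out.println(setsEqual());
    }
}
```

Note that this content-based default is exactly what makes a mutated collection hard to find again in a HashSet.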
It's the same as with any mutable class: when you insert an instance into a HashSet and then call a mutating method on it, you will get into trouble. So my answer is: yes, if there's a use for it.
You can of course wrap your collection in an immutable wrapper before adding it to the HashSet.
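The trouble described above, and the immutable-wrapper workaround, can be sketched as follows. This is an illustration, not a definitive recipe: the class and method names are hypothetical, and it assumes Java 10 or later for List.copyOf. An unmodifiable copy is used on purpose, because Collections.unmodifiableList only returns a view that still reflects changes to the backing list.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class MutatedElementDemo {
    // Adds a mutable list to a HashSet, mutates it, then looks it up again.
    static boolean foundAfterMutation() {
        Set<List<String>> set = new HashSet<>();
        List<String> list = new ArrayList<>(List.of("a"));
        set.add(list);
        list.add("b"); // hash code changes; the set still files the list under the old hash
        return set.contains(list);
    }

    // Stores an unmodifiable copy instead, so later changes to the source
    // list cannot affect the element held by the set.
    static boolean foundWithSnapshot() {
        Set<List<String>> set = new HashSet<>();
        List<String> list = new ArrayList<>(List.of("a"));
        set.add(List.copyOf(list));
        list.add("b");
        return set.contains(List.of("a"));
    }

    public static void main(String[] args) {
        System.out.println(foundAfterMutation());
        System.out.println(foundWithSnapshot());
    }
}
```

With the standard HashSet implementation the first lookup fails even though the very same object is still stored in the set, while the snapshot version keeps working.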
I think the bigger question is what should happen if someone attempts to add an instance of your FredCollection to a Set twice. Should the size() of the set be 2 or 1 after this? Will you ever need to test the "equality" of two different instances of FredCollection? I think the answer to this question matters more in determining your equals()/hashCode() behavior than anything else.
This is not just an issue for collections, but for mutable objects in general (another example: Point2D). And yes, it is a potential problem that Java programmers eventually learn to take into account.
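The earlier question of whether a Set should end up with size 1 or 2 can be tried out directly. The sketch below uses plain ArrayList instances as a stand-in for FredCollection, which is an assumption made purely for illustration, and contrasts a content-based HashSet with an identity-based set built from IdentityHashMap.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

class SetSizeDemo {
    // Content-based equality: two distinct lists with equal content collapse into one element.
    static int sizeWithContentEquality() {
        Set<List<Integer>> set = new HashSet<>();
        set.add(new ArrayList<>(List.of(1, 2)));
        set.add(new ArrayList<>(List.of(1, 2)));
        return set.size();
    }

    // Identity-based set: each distinct instance counts, whatever it contains.
    static int sizeWithIdentity() {
        Set<List<Integer>> set =
                Collections.newSetFromMap(new IdentityHashMap<List<Integer>, Boolean>());
        set.add(new ArrayList<>(List.of(1, 2)));
        set.add(new ArrayList<>(List.of(1, 2)));
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(sizeWithContentEquality()); // 1
        System.out.println(sizeWithIdentity());        // 2
    }
}
```

Which of the two results you want for your own collection type is the design decision that should drive how equals and hashCode are written.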
You should not override equals and hashCode so that they reflect mutable members.
This is more my personal point of view. I think hashCode and equals are technical terms that should not be used to implement business logic. Imagine you have two objects (not only collections) and ask whether they are equal; there are two different ways to answer:
But because equals is used by technical infrastructure (HashMap), you should implement it in a technical way and build business-logic equality with something else (something like the Comparator interface). For your collection this means: do not override equals and hashCode (in a way that breaks the technical contract; see the Javadoc of Map).
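One way to read this advice is sketched below, under stated assumptions: Account is a hypothetical mutable class invented for the example, and BY_STATE is just one possible business comparison. equals and hashCode are left at their identity-based defaults, and the business notion of "same" lives in a Comparator.

```java
import java.util.Comparator;
import java.util.HashSet;
import java.util.Set;

class BusinessEqualityDemo {
    // Hypothetical mutable class; equals/hashCode are deliberately NOT overridden,
    // so hash-based containers keep using stable identity semantics.
    static final class Account {
        private final String owner;
        private long balance;

        Account(String owner, long balance) {
            this.owner = owner;
            this.balance = balance;
        }

        String owner() { return owner; }
        long balance() { return balance; }
        void deposit(long amount) { balance += amount; }
    }

    // Business notion of "same account state", kept outside equals().
    static final Comparator<Account> BY_STATE =
            Comparator.comparing(Account::owner).thenComparingLong(Account::balance);

    static boolean sameState(Account a, Account b) {
        return BY_STATE.compare(a, b) == 0;
    }

    // The set still finds the account after mutation, because the default
    // hashCode does not depend on the mutable field.
    static boolean stillFoundAfterDeposit() {
        Account a = new Account("fred", 10);
        Set<Account> set = new HashSet<>();
        set.add(a);
        a.deposit(5);
        return set.contains(a);
    }

    public static void main(String[] args) {
        Account a = new Account("fred", 10);
        Account b = new Account("fred", 10);
        System.out.println(a.equals(b));      // identity comparison
        System.out.println(sameState(a, b));  // business comparison
        System.out.println(stillFoundAfterDeposit());
    }
}
```

The trade-off is that callers must remember to use the comparator whenever they mean business equality, since standard containers will not do so on their own.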
A fundamental difficulty with equals and hashCode is that there are two logical ways to define an equivalence relation; some consumers of a class will want one definition, while other consumers of that same class will want the other. I would define the two equivalence relations as follows:
Two object references X and Y are fully equivalent if overwriting X with a reference to Y would not alter the present or future behavior of any members of X or Y.
Two object references X and Y have equivalent state if, in a program which has not persisted the values returned from identity-related hash functions, swapping all references to X with all references to Y would leave program state unchanged.
Note that the second definition is primarily relevant in the common scenario where two things hold references to objects of some mutable type (e.g. arrays), but can be sure that, at least within some particular time frame of interest, those objects are not going to be exposed to anything that might mutate them. In such a scenario, if the "holder" objects are equivalent in all other regards, their equivalence should depend upon whether the objects they hold meet the second definition of equivalence above.
Note that the second definition does not concern itself with any details of how an object's state might change. Note further that immutable objects could, for either definition of equivalence, report distinct objects with equal content as equal or unequal (if the only way in which X and Y differ is that X.equals(X) reports true while X.equals(Y) reports false, that would be a difference), but it would probably be most useful to have such objects use reference identity for the first equivalence relation and equivalence of other aspects for the second.
Unfortunately, because Java provides only one pair of equivalence-defining methods, a class designer must guess which definition of equivalence will be most relevant to consumers of the class. While there's a substantial argument to be made in favor of always using the first, the second is often more practically useful. The biggest problem with the second is that there's no way a class can know when code using the class will want the first equivalence relation.
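Java arrays give a rough illustration of the two relations. The mapping is an approximation made for this sketch, and the class and method names are hypothetical: reference comparison stands in for full equivalence, and java.util.Arrays.equals stands in for equivalent state.

```java
import java.util.Arrays;

class TwoEquivalencesDemo {
    // First relation, approximated: for a mutable array, only the very same
    // object is interchangeable in every present and future use.
    static boolean fullyEquivalent(int[] x, int[] y) {
        return x == y;
    }

    // Second relation, approximated: the contents match right now.
    static boolean equivalentState(int[] x, int[] y) {
        return Arrays.equals(x, y);
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3};
        int[] b = {1, 2, 3};
        System.out.println(fullyEquivalent(a, b));  // false: distinct objects
        System.out.println(equivalentState(a, b));  // true: same content
        b[0] = 99; // mutating one array ends the state equivalence
        System.out.println(equivalentState(a, b));  // false
    }
}
```

Arrays themselves inherit the identity-based equals from Object, so a caller who wants the second relation has to ask for it explicitly through Arrays.equals.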
equals is used when adding or removing elements in collections such as CopyOnWriteArraySet and HashSet (in a HashSet, it is consulted when two objects have the same hashCode). equals needs to be symmetric: if B.equals(C) returns true, then C.equals(B) must return true as well. Otherwise add and remove on those sets behave in a confusing manner. See "Overriding equals for CopyOnWriteArraySet.add and remove" for how an improper override of equals affected add/remove operations on collections.
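A small sketch of what a symmetry violation looks like follows. CaseInsensitive is a hypothetical class with a deliberately broken equals, written only to illustrate the point; it is not taken from the linked question.

```java
import java.util.Locale;

class AsymmetricEqualsDemo {
    // Hypothetical class with a deliberately broken equals: it accepts plain
    // Strings, but String.equals will never accept it back.
    static final class CaseInsensitive {
        private final String s;

        CaseInsensitive(String s) { this.s = s; }

        @Override
        public boolean equals(Object o) {
            if (o instanceof CaseInsensitive) {
                return s.equalsIgnoreCase(((CaseInsensitive) o).s);
            }
            if (o instanceof String) {
                return s.equalsIgnoreCase((String) o); // this branch breaks symmetry
            }
            return false;
        }

        @Override
        public int hashCode() {
            return s.toLowerCase(Locale.ROOT).hashCode();
        }
    }

    static boolean forward() {
        return new CaseInsensitive("Abc").equals("abc");
    }

    static boolean backward() {
        return "abc".equals(new CaseInsensitive("Abc"));
    }

    public static void main(String[] args) {
        System.out.println(forward());   // true
        System.out.println(backward());  // false
    }
}
```

Because a collection may call equals on either the stored element or the argument, the outcome of contains or remove with such a class can depend on implementation details of the collection.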