如何使用 equals() 和 hashCode() 实现近似几何相等
我想实现一些具有数值鲁棒性的几何算法。
为此,使用系统范围的delta
来实现几何相等。点的 equals()
是通过使用 delta
近似相等的距离计算来实现的。
我希望能够使用常规的 java 集合,例如 Set
。但我无法想出一个合理的 hashCode()
实现。
我的猜测是,高效使用 HashSet 的实现将导致具有“软”边界的空间分区。与分区边界距离小于 delta
的点应该能够同时分类到最多八个(对于 3D)相邻区域。距离足够近以被视为相等但位于分区不同侧的点否则将被“错误分类”。
这是我无法理解的。 hashCode()
就像将项目放入桶中,其中单个项目最终放入单个桶中,而我需要将其放入最多八个。
什么是合理的解决方案?我是否滥用了 hashCode() 的用途?仍然使用 hashCode()
的最合理的解决方案是什么:)
编辑: 谢谢,我有一种直觉,这个想法有问题,但无法提出我的手指放在上面。你把事情说得很清楚了,
请让我将我的问题扩展为:如果我可以接受较慢的 HashSet
操作(这不是一个阻碍),我可以使用 hashCode()
返回 1
因为在我的例子中没有正确的实现,如果我执行 equals()
删除传递性要求?
I would like to implement some geometrical algorithms with numerical robustness.
For this, a system-wide delta
is used for geometric equality. equals()
for points is implemented with a distance computation using the delta
for approximate equality.
I would like to be able to use regular java Collections, e.g Set
. But I can't come up with a reasonable hashCode()
implementation.
My guess is that an implementation for an efficient HashSet
use would result in a space partitioning with "soft" borders. Points with distance less that delta
to the partition border should be able to be classified in up to eight (for 3D) adjacent regions at the same time. Points close enough to be considered equal on behalf of their distance but lying on different sides of the partition would otherwise be "misclassified".
This is what I can't get my head around. hashCode()
is like putting items in buckets with a single item ending up in a single bucket, while I would need to put it in up to eight.
What would be a resonable solution? Am I abusing the purpose of hashCode()
? And what would be the most reasonable solution still using hashCode()
:)
EDIT: Thank you, I had an intuition that there was something wrong with the idea but could not put my finger on it. You made the matter very clear
Please let me extend my question to : if I'm fine with slower HashSet
operation (which is not a showstopper), I could have hashCode()
return 1
as there is no correct implementation in my case, what dire consequences would there be (it terms of geometric computations), if I do implement equals()
dropping the transitivity requirement?
EDIT I have found this post, highlighting the issues with missing transitivity and and this post, which is closely related to this one.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
对于非
equals()
对象,hashCode()
可以相等。所以,事实上,桶装可能就很好。例如,如果您只使用最近“网格”点的某个哈希函数,您可以在其中任意定义该网格,只要您一致且正确地定义舍入,它就可以作为哈希函数工作。但是,您不会从第一个定义中获得
equals()
的正确概念。如果 A 和 B、B 和 C 彼此在 delta 范围内,这并不意味着 A 和 C 也在范围内。传递性不成立。您可以根据分桶定义
equals()
。当点接近但线的另一侧不相等时,它可能会给出稍微令人惊讶的结果。hashCode()
can be equal for un-equals()
objects. So, indeed, bucketing is probably just fine. For example, if you just use some hash function of the nearest "grid" point, where you can define that grid however you want, it will work as a hash function as long as you define the rounding consistently and correctly.However, you are not going to get a correct notion of
equals()
from your first definition. If A and B, and B and C, are within delta of one another, this does not mean that A and C are. Transitivity does not hold.You could define
equals()
in terms of bucketing. It may give slightly surprising results as points close but on the other side of a line are not equal.这根本上是不可能的。
基于
equals()
/hashCode()
的相等只能在数学 等价关系遵守三个规则:您的定义不具有传递性,因此您不能使用它。
相反,您需要一个替代的数据结构。
This is fundamentally impossible.
equals()
/hashCode()
-based equality can only be defined on mathematical equivalence relations that obey three rules:Your definition is not transitive, so you cannot use it.
Instead, you need an alternate data structure.
我建议反对这整个方法。一方面,它会破坏
equals
的传递属性。 (如果 delta == 0.1,则 1 == 1.1 且 1.1 == 1.2,但 1 != 1.2。)没有哈希码实现可以解决这个问题。一般来说,它也可能使集合框架变得混乱。I would recommend against this entire approach. For one thing, it will break the transitive property of
equals
. (If delta == 0.1, then 1 == 1.1 and 1.1 == 1.2, but 1 != 1.2.) No hashcode implementation is going to fix that. It's also likely to make a mess of the collections framework in general.