C# 如何为违反 Equals 约定的类选择 Hashcode?

发布于 2024-07-18 07:52:30 字数 688 浏览 14 评论 0原文

我有多个类,由于某些原因,不遵循官方 Equals 合同。 在覆盖的 GetHashCode() 中,这些类仅返回 0,以便可以在 Hashmap 中使用它们。

其中一些类实现相同的接口,并且存在使用该接口作为键的哈希图。 所以我认为每个类至少应该在 GetHashCode() 中返回一个不同的(但仍然是恒定的)值。

问题是如何选择这个值。 我应该简单地让第一个类返回 1,下一个类返回 2 等等? 或者我应该尝试类似的方法

class SomeClass : SomeInterface {
    public overwrite int GetHashCode() {
        return "SomeClass".GetHashCode();
    }
}

,使哈希分布更均匀? (我是否必须自己缓存返回值,或者 Microsoft 的编译器能够对此进行优化吗?)

更新: 不可能为每个对象返回单独的哈希码,因为 Equals 违反了约定。 具体来说,我指的是 这个问题

I've got multiple classes that, for certain reasons, do not follow the official Equals contract. In the overwritten GetHashCode() these classes simply return 0 so they can be used in a Hashmap.

Some of these classes implement the same interface and there are Hashmaps using this interface as key. So I figured that every class should at least return a different (but still constant) value in GetHashCode().

The question is how to select this value. Should I simply let the first class return 1, the next class 2 and so on? Or should I try something like

class SomeClass : SomeInterface {
    public overwrite int GetHashCode() {
        return "SomeClass".GetHashCode();
    }
}

so the hash is distributed more evenly? (Do I have to cache the returned value myself or is Microsoft's compiler able to optimize this?)

Update: It is not possible to return an individual hashcode for each object, because Equals violates the contract. Specifially, I'm refering to this problem.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(4

烈酒灼喉 2024-07-25 07:52:30

如果它“违反了平等合同”,那么我不确定您是否应该将其用作密钥。

如果某个东西使用它作为键,您确实需要正确地进行哈希处理...目前还不清楚 Equals 逻辑是什么,但是两个被认为相等的值必须 具有相同的哈希码。 不要求具有相同哈希码的两个值相等。

使用常量字符串并没有多大帮助 - 你会得到在类型上均匀分割的值,但仅此而已......

If it "violates the Equals contract", then I'm not sure you should be using it as a key.

It something is using that as a key, you really need to get the hashing right... it is very unclear what the Equals logic is, but two values that are considered equal must have the same hash-code. It is not required that two values with the same hash-code are equal.

Using a constant string won't really help much - you'll get the values split evenly over the types, but that is about it...

陌上芳菲 2024-07-25 07:52:30

我很好奇重写 GetHashCode() 并返回常量值的原因是什么。 为什么要违反哈希的想法,而不是仅仅违反“契约”,根本不重写 GetHashCode() 函数并保留 Object 的默认实现?

编辑

如果您所做的是这样您就可以让对象根据其内容而不是引用进行匹配,那么您建议让不同的类简单地使用不同的常量就可以工作,但效率非常低。 您想要做的是提出一个哈希算法,该算法可以获取类的内容并生成一个平衡速度与均匀分布的值(即哈希 101)。

我想我不确定你在寻找什么......没有一个“好的”方案来为这个范例选择常数。 其中一个并不比另一个更好。 尝试改进您的对象,以便创建真正的哈希。

I'm curious what the reasoning would be for overriding GetHashCode() and returning a constant value. Why violate the idea of a hash rather than just violating the "contract" and not overriding the GetHashCode() function at all and leave the default implementation from Object?

Edit

If what you've done is that so you can have your objects match based on their contents rather than their reference then what you propose with having different classes simply use different constants can WORK, but is highly inefficient. What you want to do is come up with a hashing algorithm that can take the contents of your class and produce a value that balances speed with even distribution (that's hashing 101).

I guess I'm not sure what you're looking for...there isn't a "good" scheme for choosing constant numbers for this paradigm. One is not any better than the other. Try to improve your objects so that you're creating a real hash.

起风了 2024-07-25 07:52:30

我在编写向量类时遇到了这个问题。 我想比较向量是否相等,但浮点运算会产生舍入错误,所以我想要近似相等。 长话短说,除非您的实现是对称的、自反的和传递的,否则覆盖 equals 是一个坏主意。

其他类将假设 equals 具有这些属性,使用这些类的类也会如此,因此您可能会遇到奇怪的情况。 例如,列表可能会强制执行唯一性,但最终会得到两个元素,这些元素的计算结果等于某个元素 B。

哈希表是打破相等性时不可预测行为的完美示例。 例如:

//Assume a == b, b == c, but a != c
var T = new Dictionary<YourType, int>()
T[a] = 0
T[c] = 1
return T[b] //0 or 1? who knows!

另一个例子是 Set:

//Assume a == b, b == c, but a != c
var T = new HashSet<YourType>()
T.Add(a)
T.Add(c)
if (T.contains(b)) then T.remove(b)
//surely T can't contain b anymore! I sure hope no one breaks the properties of equality!
if (T.contains(b)) then throw new Exception()

我建议使用另一种方法,名称如 ApproxEquals。 您还可以考虑重写 == 运算符,因为它不是虚拟的,因此不会被其他类(如 Equals)意外使用。

如果您确实无法对哈希表使用引用相等,请不要破坏可以使用的情况的性能。 添加一个 IApproxEquals 接口,在类中实现它,并向 Dictionary 添加一个扩展方法 GetApprox,该方法枚举查找近似相等的键,并返回关联的值。 您还可以专门针对 3 维向量或任何您需要的内容编写自定义词典。

I ran into this exact problem when writing a vector class. I wanted to compare vectors for equality, but float operations give rounding errors, so I wanted approximate equality. Long story short, overriding equals is a bad idea unless your implementation is symmetric, reflexive, and transitive.

Other classes are going to assume equals has those properties, and so will classes using those classes, and so you can end up in weird cases. For example a list might enforce uniqueness, but end up with two elements which evaluate as equal to some element B.

A hash table is the perfect example of unpredictable behavior when you break equality. For example:

//Assume a == b, b == c, but a != c
var T = new Dictionary<YourType, int>()
T[a] = 0
T[c] = 1
return T[b] //0 or 1? who knows!

Another example would be a Set:

//Assume a == b, b == c, but a != c
var T = new HashSet<YourType>()
T.Add(a)
T.Add(c)
if (T.contains(b)) then T.remove(b)
//surely T can't contain b anymore! I sure hope no one breaks the properties of equality!
if (T.contains(b)) then throw new Exception()

I suggest using another method, with a name like ApproxEquals. You might also consider overriding the == operator, because it isn't virtual and therefore won't be used accidentally by other classes like Equals could be.

If you really can't use reference equality for the hash table, don't ruin the performance of cases where you can. Add an IApproxEquals interface, implement it in your class, and add an extension method GetApprox to Dictionary which enumerates the keys looking for an approximately equal one, and returns the associated value. You could also write a custom dictionary especially for 3-dimensional vectors, or whatever you need.

梦在深巷 2024-07-25 07:52:30

当发生哈希冲突时,哈希表/字典会调用 Equals 来查找您要查找的键。 使用恒定的哈希码首先消除了使用哈希的速度优势 - 它变成了线性搜索。

你是说 Equals 方法没有按照合同实现。 你这到底是什么意思? 根据违规类型,哈希表或字典只会变慢(线性搜索)或根本无法工作。

When hash collisions occur, the HashTable/Dictionary calls Equals to find the key you're looking for. Using a constant hash code removes the speed advantages of using a hash in the first place - it becomes a linear search.

You're saying the Equals method hasn't been implemented according to the contract. What exactly do you mean with this? Depending on the kind of violation, the HashTable or Dictionary will merely be slow (linear search) or not work at all.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文