使用异或的 GetHashCode() 问题

发布于 2024-07-24 13:58:58 字数 1465 浏览 12 评论 0原文

我的理解是,您通常应该将 xor 与 GetHashCode() 一起使用来生成 int,以通过其值(而不是通过其引用)来识别数据。 这是一个简单的例子:

class Foo
{
    int m_a;
    int m_b;

    public int A
    {
        get { return m_a; }
        set { m_a = value; }
    }

    public int B
    {
        get { return m_b; }
        set { m_b = value; }
    }

    public Foo(int a, int b)
    {
        m_a = a;
        m_b = b;
    }

    public override int GetHashCode()
    {
        return A ^ B;
    }

    public override bool Equals(object obj)
    {
        return this.GetHashCode() == obj.GetHashCode();
    }
}

我的想法是,我想根据属性 A 和 B 的值将 Foo 的一个实例与另一个实例进行比较。如果 Foo1.A == Foo2.A 且 Foo1.B == Foo2.B,那么我们有平等。

问题如下:

Foo one = new Foo(1, 2);
Foo two = new Foo(2, 1);

if (one.Equals(two)) { ... }  // This is true!

这两个函数都会为 GetHashCode() 生成值 3,从而导致 Equals() 返回 true。 显然,这是一个简单的示例,只有两个属性,我可以简单地比较 Equals() 方法中的各个属性。 然而,对于更复杂的类,这很快就会失控。

我知道有时只设置一次哈希码并始终返回相同的值是很有意义的。 然而,对于需要评估相等性的可变对象,我认为这是不合理的。

在实现 GetHashCode() 时处理可以轻松互换的属性值的最佳方法是什么?

另请参阅

什么是最佳算法重写 System.Object.GetHashCode?

My understanding is that you're typically supposed to use xor with GetHashCode() to produce an int to identify your data by its value (as opposed to by its reference). Here's a simple example:

class Foo
{
    int m_a;
    int m_b;

    public int A
    {
        get { return m_a; }
        set { m_a = value; }
    }

    public int B
    {
        get { return m_b; }
        set { m_b = value; }
    }

    public Foo(int a, int b)
    {
        m_a = a;
        m_b = b;
    }

    public override int GetHashCode()
    {
        return A ^ B;
    }

    public override bool Equals(object obj)
    {
        return this.GetHashCode() == obj.GetHashCode();
    }
}

The idea being, I want to compare one instance of Foo to another based on the value of properties A and B. If Foo1.A == Foo2.A and Foo1.B == Foo2.B, then we have equality.

Here's the problem:

Foo one = new Foo(1, 2);
Foo two = new Foo(2, 1);

if (one.Equals(two)) { ... }  // This is true!

These both produce a value of 3 for GetHashCode(), causing Equals() to return true. Obviously, this is a trivial example, and with only two properties I could simply compare the individual properties in the Equals() method. However, with a more complex class this would get out of hand quickly.

I know that sometimes it makes good sense to set the hash code only once, and always return the same value. However, for mutable objects where an evaluation of equality is necessary, I don't think this is reasonable.

What's the best way to handle property values that could easily be interchanged when implementing GetHashCode()?

See Also

What is the best algorithm for an overridden System.Object.GetHashCode?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(7

凉栀 2024-07-31 13:58:58

首先 - 不要仅根据 GetHashCode() 实现 Equals() - 即使对象不相等,哈希码有时也会发生冲突。

GetHashCode() 的契约包括以下内容:

  • 不同的哈希码意味着对象绝对不相等
  • 相同的哈希码意味着对象可能相等(但也可能不相等)

Andrew Hare 建议我合并他的答案:

我会推荐您阅读了此解决方案(顺便说一句,由我们自己的 Jon Skeet 编写),以获取“更好”的计算哈希码的方法。

不,上面的速度相对较慢并且
没有多大帮助。 有些人使用
XOR(例如 a ^ b ^ c),但我更喜欢
乔什·布洛赫(Josh Bloch)的方法中显示的一种方法
“有效的Java”:

public override int GetHashCode() 
  { 
      整数哈希 = 23; 
      哈希=哈希*37+起重机配重ID; 
      哈希 = 哈希*37 + 预告片ID; 
      散列=散列*37+craneConfigurationTypeCode.GetHashCode(); 
      返回哈希值; 
  } 
  

23和37是任意数字
它们是互质的。

上述相对于 XOR 的好处
方法是如果你有一个类型
它有两个值,分别是
经常相同,对这些进行异或
值总是给出相同的
结果 (0) 而上面的结果
区分它们,除非
你真是太不幸了。

正如上面的代码片段中提到的,您可能还想看看Joshua Bloch 的书,Effective Java, 其中包含对该主题的很好的处理(哈希码讨论也适用于 .NET)。

First off - Do not implement Equals() only in terms of GetHashCode() - hashcodes will sometimes collide even when objects are not equal.

The contract for GetHashCode() includes the following:

  • different hashcodes means that objects are definitely not equal
  • same hashcodes means objects might be equal (but possibly might not)

Andrew Hare suggested I incorporate his answer:

I would recommend that you read this solution (by our very own Jon Skeet, by the way) for a "better" way to calculate a hashcode.

No, the above is relatively slow and
doesn't help a lot. Some people use
XOR (eg a ^ b ^ c) but I prefer the
kind of method shown in Josh Bloch's
"Effective Java":

public override int GetHashCode()
{
    int hash = 23;
    hash = hash*37 + craneCounterweightID;
    hash = hash*37 + trailerID;
    hash = hash*37 + craneConfigurationTypeCode.GetHashCode();
    return hash;
}

The 23 and 37 are arbitrary numbers
which are co-prime.

The benefit of the above over the XOR
method is that if you have a type
which has two values which are
frequently the same, XORing those
values will always give the same
result (0) whereas the above will
differentiate between them unless
you're very unlucky.

As mentioned in the above snippet, you might also want to look at Joshua Bloch's book, Effective Java, which contains a nice treatment of the subject (the hashcode discussion applies to .NET as well).

小巷里的女流氓 2024-07-31 13:58:58

Andrew 发布了一个生成更好的哈希代码的好示例,但也要记住,您不应该使用哈希代码作为相等性检查,因为它们不能保证是唯一的。

举一个简单的例子来说明为什么这被认为是一个双重对象。 它比 int 有更多可能的值,因此不可能为每个 double 都有一个唯一的 int。 哈希实际上只是第一遍,用于像字典这样的情况,当您需要快速找到密钥时,通过首先比较哈希,可以排除很大一部分可能的密钥,并且只有具有匹配哈希的密钥才需要花费完全相等检查(或其他冲突解决方法)。

Andrew has posted a good example for generating a better hash code, but also bear in mind that you shouldn't use hash codes as an equality check, since they are not guaranteed to be unique.

For a trivial example of why this is consider a double object. It has more possible values than an int so it is impossible to have a unique int for each double. Hashes are really just a first pass, used in situations like a dictionary when you need to find the key quickly, by first comparing hashes a large percentage of the possible keys can be ruled out and only the keys with matching hashes need to have the expense of a full equality check (or other collision resolution methods).

小巷里的女流氓 2024-07-31 13:58:58

散列总是涉及冲突,你必须处理它(例如,比较散列值,如果它们相等,则精确比较类内的值以确保类相等)。

使用简单的异或,您会遇到很多冲突。 如果您想要更少,请使用一些数学函数将值分布在不同的位上(移位、与素数相乘等)。

Hashing always involves collisions and you have to deal with it (f.e., compare hash values and if they are equal, exactly compare the values inside the classes to be sure the classes are equal).

Using a simple XOR, you'll get many collisions. If you want less, use some mathematical functions that distribute values across different bits (bit shifts, multiplying with primes etc.).

深居我梦 2024-07-31 13:58:58

阅读重写可变对象的 GetHashCode? C# 并考虑实现 IEquatable

Read Overriding GetHashCode for mutable objects? C# and think about implementing IEquatable<T>

哑剧 2024-07-31 13:58:58

有几种更好的哈希实现。 例如 FNV 哈希

There are several better hash implementations. FNV hash for example.

随遇而安 2024-07-31 13:58:58

出于好奇,因为哈希码通常不是比较的好主意,所以只执行以下代码不是更好吗?还是我遗漏了一些东西?

public override bool Equals(object obj)
{
    bool isEqual = false;
    Foo otherFoo = obj as Foo;
    if (otherFoo != null)
    {
        isEqual = (this.A == otherFoo.A) && (this.B == otherFoo.B);
    }
    return isEqual;
}

Out of curiosity since hashcodes are typically a bad idea for comparison, wouldn't it be better to just do the following code, or am I missing something?

public override bool Equals(object obj)
{
    bool isEqual = false;
    Foo otherFoo = obj as Foo;
    if (otherFoo != null)
    {
        isEqual = (this.A == otherFoo.A) && (this.B == otherFoo.B);
    }
    return isEqual;
}
风为裳 2024-07-31 13:58:58

哈希的快速生成和良好分布

public override int GetHashCode()
{
    return A.GetHashCode() ^ B.GetHashCode();         // XOR
}

A quick generate and good distribution of hash

public override int GetHashCode()
{
    return A.GetHashCode() ^ B.GetHashCode();         // XOR
}
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文