Is it incorrect to define an object's hash code as the sum, product, etc. of the hash codes of all its class variables?

Posted 2024-08-30 21:37:26

Let's say I have the following class:

class ABC {
    private int myInt = 1;
    private double myDouble = 2;
    private String myString = "123";
    private SomeRandomClass1 myRandomClass1 = new ...
    private SomeRandomClass2 myRandomClass2 = new ...

    //pseudo code
    public int myHashCode() {
        return 37 *
               myInt.hashCode() *
               myDouble.hashCode() *
               ... *
               myRandomClass.hashCode();
    }
}

Would this be a correct implementation of hashCode? This is not how I usually do it (I tend to follow Effective Java's guidelines), but I am always tempted to just do something like the code above.

Thanks

那些过往 2024-09-06 21:37:26

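It depends what you mean by "correct". Assuming that you're using the hashCode() of all the relevant equals()-defining fields, then yes, it's "correct": equal objects will still produce equal hash codes. However, such formulas probably will not have a good distribution, and would therefore likely cause more collisions than a better-mixed formula, which will have a detrimental effect on performance. With a pure product, for instance, any field whose hash code is 0 forces the whole result to 0, and swapping the values of two fields never changes the result.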
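To make the distribution concern concrete, here is a minimal sketch comparing a product-style hash with an Effective Java style combination. The class HashCombineDemo and its two-int-field setup are hypothetical, chosen only for illustration; they are not from the question.

```java
// Hypothetical demo class: compares two ways of combining field hash codes.
class HashCombineDemo {
    // Product-style hash, in the spirit of the question (two int fields assumed).
    static int productHash(int a, int b) {
        return 37 * Integer.hashCode(a) * Integer.hashCode(b);
    }

    // Effective Java style: start from a nonzero constant, then 31 * result + c.
    static int recipeHash(int a, int b) {
        int result = 17;
        result = 31 * result + Integer.hashCode(a);
        result = 31 * result + Integer.hashCode(b);
        return result;
    }

    public static void main(String[] args) {
        // A zero field wipes out the product, so unrelated objects collide.
        System.out.println(productHash(0, 5) == productHash(0, 7)); // true
        // Multiplication is commutative, so swapped fields collide as well.
        System.out.println(productHash(1, 2) == productHash(2, 1)); // true
        // The recipe keeps both pairs apart.
        System.out.println(recipeHash(0, 5) == recipeHash(0, 7));   // false
        System.out.println(recipeHash(1, 2) == recipeHash(2, 1));   // false
    }
}
```

Both functions are still "correct" in the sense of the hashCode contract; the difference is only in how often unequal objects end up with the same value.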

Here's a quote from Effective Java 2nd Edition, Item 9: Always override hashCode when you override equals

While the recipe in this item yields reasonably good hash functions, it does not yield state-of-the-art hash functions, nor do Java platform libraries provide such hash functions as of release 1.6. Writing such hash functions is a research topic, best left to mathematicians and computer scientists. [...Nonetheless,] the techniques described in this item should be adequate for most applications.

It may not require a lot of mathematical power to evaluate how good your proposed hash function is, but why even bother? Why not just follow something that has been anecdotally proven to be adequate in practice?

Josh Bloch's recipe

Store some constant nonzero value, say 17, in an int variable called result.
Compute an int hashcode c for each field:
- If the field is a boolean, compute (f ? 1 : 0)
- If the field is a byte, char, short, or int, compute (int) f
- If the field is a long, compute (int) (f ^ (f >>> 32))
- If the field is a float, compute Float.floatToIntBits(f)
- If the field is a double, compute Double.doubleToLongBits(f), then hash the resulting long as above.
- If the field is an object reference and this class's equals method compares the field by recursively invoking equals, recursively invoke hashCode on the field. If the value of the field is null, return 0.
- If the field is an array, treat it as if each element is a separate field. If every element in an array field is significant, you can use one of the Arrays.hashCode methods added in release 1.5.
Combine the hashcode c into result as follows: result = 31 * result + c;
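Applied by hand, the recipe above might look like the following sketch. RecipeExample is a hypothetical stand-in for the ABC class in the question, reduced to three fields (an int, a double, and a possibly null String) so that it is self-contained; the SomeRandomClass fields are left out because their definitions are not shown.

```java
import java.util.Objects;

// Hypothetical class used only to illustrate the recipe step by step.
final class RecipeExample {
    private final int myInt;
    private final double myDouble;
    private final String myString; // may be null

    RecipeExample(int myInt, double myDouble, String myString) {
        this.myInt = myInt;
        this.myDouble = myDouble;
        this.myString = myString;
    }

    @Override public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof RecipeExample)) return false;
        RecipeExample other = (RecipeExample) o;
        return myInt == other.myInt
            && Double.doubleToLongBits(myDouble) == Double.doubleToLongBits(other.myDouble)
            && Objects.equals(myString, other.myString);
    }

    @Override public int hashCode() {
        int result = 17;                                   // constant nonzero start
        result = 31 * result + myInt;                      // int field: (int) f
        long bits = Double.doubleToLongBits(myDouble);     // double field: to long,
        result = 31 * result + (int) (bits ^ (bits >>> 32)); // then hash the long
        result = 31 * result + (myString == null ? 0 : myString.hashCode()); // reference: 0 for null
        return result;
    }
}
```

The fields used in hashCode are exactly the ones compared in equals, which is what keeps equal objects on equal hash codes.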

Now, of course that recipe is rather complicated, but luckily, you don't have to reimplement it every time, thanks to java.util.Arrays.hashCode(Object[]) (and com.google.common.base.Objects provides a convenient vararg variant).

// requires: import java.util.Arrays;
@Override public int hashCode() {
    return Arrays.hashCode(new Object[] {
           myInt,          // auto-boxed
           myDouble,       // auto-boxed
           myString,
           myRandomClass1,
           myRandomClass2,
    });
}
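On Java 7 and later, the JDK itself also has a vararg helper, java.util.Objects.hash(Object...), which is documented to behave as if the values were placed in an array and passed to Arrays.hashCode(Object[]). A minimal sketch, using a hypothetical class ObjectsHashExample with only three of the question's fields:

```java
import java.util.Objects;

// Hypothetical class showing the vararg helper from java.util.Objects.
final class ObjectsHashExample {
    private final int myInt;
    private final double myDouble;
    private final String myString; // may be null; Objects.hash treats null as 0

    ObjectsHashExample(int myInt, double myDouble, String myString) {
        this.myInt = myInt;
        this.myDouble = myDouble;
        this.myString = myString;
    }

    @Override public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof ObjectsHashExample)) return false;
        ObjectsHashExample other = (ObjectsHashExample) o;
        return myInt == other.myInt
            && Double.compare(myDouble, other.myDouble) == 0
            && Objects.equals(myString, other.myString);
    }

    @Override public int hashCode() {
        // Primitives are auto-boxed; the varargs call allocates an Object[].
        return Objects.hash(myInt, myDouble, myString);
    }
}
```

The boxing and array allocation cost a little on every call, so for objects hashed very frequently a hand-written version of the recipe may still be preferable.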

See also

Object.hashCode()

It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hashtables.