How should I implement Object.GetHashCode() for complex equality?

Posted on 2024-07-26 19:17:23

Basically, I have the following so far:

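class Foo : IEquatable<Foo> {
    public override bool Equals(object obj)
    {
        Foo d = obj as Foo;
        if (d == null)
            return false;

        return this.Equals(d);
    }

    #region IEquatable<Foo> Members

    public bool Equals(Foo other)
    {
        // Guard against a null argument when this overload is called directly.
        if (other == null)
            return false;

        // A non-empty Guid decides equality on its own.
        if (this.Guid != String.Empty && this.Guid == other.Guid)
            return true;
        else if (this.Guid != String.Empty || other.Guid != String.Empty)
            return false;

        // Neither object has a Guid: fall back to the descriptive fields.
        if (this.Title == other.Title &&
            this.PublishDate == other.PublishDate &&
            this.Description == other.Description)
            return true;

        return false;
    }

    #endregion
}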

So, the problem is this: I have a non-required field Guid, which is a unique identifier. If it isn't set, then I need to fall back on less accurate metrics to decide whether two objects are equal. This works fine, but it makes GetHashCode() messy... How should I go about it? A naive implementation would be something like:

public override int GetHashCode() {
    if (this.Guid != String.Empty)
        return this.Guid.GetHashCode();

    int hash = 37;
    hash = hash * 23 + this.Title.GetHashCode();
    hash = hash * 23 + this.PublishDate.GetHashCode();
    hash = hash * 23 + this.Description.GetHashCode();
    return hash;
}

But what are the chances of the two types of hash colliding? Certainly, I wouldn't expect it to be 1 in 2 ** 32. Is this a bad idea, and if so, how should I be doing it?

Comments (2)

予囚 2024-08-02 19:17:24

A very easy hash code method for custom classes is to bitwise XOR each of the fields' hash codes together. It can be as simple as this:

int hash = 0;
hash ^= this.Title.GetHashCode();
hash ^= this.PublishDate.GetHashCode();
hash ^= this.Description.GetHashCode();
return hash;

From the link above:

XOR has the following nice properties:

  • It does not depend on order of computation.
  • It does not “waste” bits. If you change even one bit in one of the components, the final value will change.
  • It is quick, a single cycle on even the most primitive computer.
  • It preserves uniform distribution. If the two pieces you combine are uniformly distributed so will the combination be. In other words, it does not tend to collapse the range of the digest into a narrower band.

XOR doesn't work well if you expect to have duplicate values in your fields, as duplicate values will cancel each other out when XORed. Since you're hashing together three unrelated fields, that should not be a problem in this case.
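To make the order-independence and cancellation points concrete, here is a minimal sketch. The class and method names (XorHashDemo, XorCombine, MulAddCombine) are made up for illustration and are not part of the question's code; MulAddCombine simply reuses the multiply-and-add pattern from the question for comparison.

```csharp
using System;

static class XorHashDemo
{
    // Combine three field hashes with XOR, as in the snippet above.
    public static int XorCombine(int a, int b, int c) => a ^ b ^ c;

    // Order-sensitive alternative: the multiply-and-add pattern from the question.
    public static int MulAddCombine(int a, int b, int c)
    {
        unchecked // let the arithmetic wrap around instead of throwing on overflow
        {
            int hash = 37;
            hash = hash * 23 + a;
            hash = hash * 23 + b;
            hash = hash * 23 + c;
            return hash;
        }
    }

    static void Main()
    {
        // Hypothetical field values; any values behave the same way here.
        int title = "Dune".GetHashCode();
        int date = new DateTime(1965, 8, 1).GetHashCode();
        int description = "A novel".GetHashCode();

        // Swapping two fields does not change the XOR result.
        Console.WriteLine(XorCombine(title, date, description)
                          == XorCombine(description, date, title)); // True

        // Two identical field hashes cancel out, leaving only the third.
        Console.WriteLine(XorCombine(title, date, title) == date); // True

        // The multiply-and-add combiner does distinguish this swap.
        Console.WriteLine(MulAddCombine(1, 2, 3) == MulAddCombine(3, 2, 1)); // False
    }
}
```

Whether this matters depends on the data: if Title and Description are often identical or interchangeable, XOR will produce more collisions than an order-sensitive combiner.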

辞别 2024-08-02 19:17:24

I don't think there is a problem with the approach you have chosen to use. Worrying 'too much' about hash collisions is almost always an indication of over-thinking the problem; as long as the hash is highly likely to be different you should be fine.

Ultimately you may even want to consider leaving out the Description from your hash anyway if it is reasonable to expect that most of the time objects can be distinguished based on their title and publication date (books?).

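You could even consider disregarding the GUID in your hash function altogether, and only use it in the Equals implementation to disambiguate the unlikely(?) case of hash clashes. Note that this is only safe if two objects with the same GUID always have the same title and publication date, because objects that compare equal must return the same hash code.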
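For reference, here is one way the question's two-branch approach could be written so that GetHashCode() stays consistent with Equals(). This is only a sketch under assumptions: the property types are guesses (the question never declares them), FooDemo and its sample values are invented for illustration, and System.HashCode.Combine requires .NET Core 2.1 or later (or .NET Standard 2.1).

```csharp
using System;
using System.Collections.Generic;

// Property types are assumptions; the question does not show the declarations.
sealed class Foo : IEquatable<Foo>
{
    public string Guid { get; set; } = String.Empty;
    public string Title { get; set; } = String.Empty;
    public DateTime PublishDate { get; set; }
    public string Description { get; set; } = String.Empty;

    public override bool Equals(object obj) => Equals(obj as Foo);

    public bool Equals(Foo other)
    {
        if (other is null) return false;

        // Same rules as in the question: a non-empty Guid decides on its own.
        if (Guid != String.Empty && Guid == other.Guid) return true;
        if (Guid != String.Empty || other.Guid != String.Empty) return false;

        return Title == other.Title
            && PublishDate == other.PublishDate
            && Description == other.Description;
    }

    public override int GetHashCode()
    {
        // Equal objects either share a non-empty Guid or both lack one,
        // so each branch returns the same hash for objects that compare equal.
        if (Guid != String.Empty) return Guid.GetHashCode();
        return HashCode.Combine(Title, PublishDate, Description);
    }
}

static class FooDemo
{
    static void Main()
    {
        // Same Guid, different titles: equal under the question's rules.
        var a = new Foo { Guid = "abc", Title = "First title" };
        var b = new Foo { Guid = "abc", Title = "Edited title" };

        var set = new HashSet<Foo> { a };
        Console.WriteLine(set.Contains(b)); // True
    }
}
```

A hash built only from Title and PublishDate would send a and b to different buckets here, which is why dropping the GUID from the hash needs the extra guarantee about matching fields.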
