GetHashCode extension method

Posted on 2024-07-16 17:58:50

After reading all the questions and answers on StackOverflow about overriding GetHashCode(), I wrote the following extension method to make overriding GetHashCode() easy and convenient:

public static class ObjectExtensions
{
    private const int _seedPrimeNumber = 691;
    private const int _fieldPrimeNumber = 397;
    public static int GetHashCodeFromFields(this object obj, params object[] fields) {
        unchecked { //unchecked to prevent throwing overflow exception
            int hashCode = _seedPrimeNumber;
            for (int i = 0; i < fields.Length; i++)
                if (fields[i] != null)
                    hashCode *= _fieldPrimeNumber + fields[i].GetHashCode();
            return hashCode;
        }
    }
}

(I basically only refactored the code that someone posted there, because I really like that it can be used generally)

which I use like this:

    public override int GetHashCode() {
        return this.GetHashCodeFromFields(field1, field2, field3);
    }

Do you see any problems with this code?

旧伤慢歌 2024-07-23 17:58:50

That looks like a solid way to do it.

My only suggestion is that if you're really concerned about its performance, you may want to add generic versions for several common cases (i.e. probably 1-4 args). That way, for those objects (which are most likely to be small, key-style composite objects), you avoid the overhead of building the array to pass to the method, the loop, any boxing of value-type arguments, etc. The call syntax will be exactly the same, but you'll run slightly more optimized code for that case. Of course, I'd run some perf tests on this before deciding whether it's worth the maintenance trade-off.

Something like this:

public static int GetHashCodeFromFields<T1,T2,T3,T4>(this object obj, T1 obj1, T2 obj2, T3 obj3, T4 obj4) {
    int hashCode = _seedPrimeNumber;
    if(obj1 != null)
        hashCode *= _fieldPrimeNumber + obj1.GetHashCode();
    if(obj2 != null)
        hashCode *= _fieldPrimeNumber + obj2.GetHashCode();
    if(obj3 != null)
        hashCode *= _fieldPrimeNumber + obj3.GetHashCode();
    if(obj4 != null)
        hashCode *= _fieldPrimeNumber + obj4.GetHashCode();
    return hashCode;
}
只是一片海 2024-07-23 17:58:50

I wrote some stuff a little while back that might solve your problem... (And actually, it could probably be improved to include the seed that you have...)

Anyway, the project is called Essence ( http://essence.codeplex.com/ ), and it uses the System.Linq.Expression libraries to generate (based on attributes) standard representations of Equals/GetHashCode/CompareTo/ToString, as well as being able to create IEqualityComparer and IComparer classes based on an argument list. (I also have some further ideas, but would like to get some community feedback before continuing too much further.)

(What this means is that it's almost as fast as handwritten code. The main case where it isn't is CompareTo(), because Linq.Expressions doesn't have the concept of a variable in the 3.5 release, so you have to call CompareTo() on the underlying object twice when you don't get a match. Using the DLR extensions to Linq.Expressions solves this. I suppose I could have emitted IL directly, but I wasn't that inspired at the time.)

It's quite a simple idea, but I haven't seen it done before.

Now the thing is, I kind of lost interest in polishing it (which would have included writing an article for codeproject, documenting some of the code, or the like), but I might be persuaded to do so if you feel it would be something of interest.

(The codeplex site doesn't have a downloadable package; just go to the source and grab that. Oh, and it's written in F# (although all the test code is in C#), as that was the thing I was interested in learning.)

Anyway, here is a C# example from the tests in the project:

    // --------------------------------------------------------------------
    // USING THE ESSENCE LIBRARY:
    // --------------------------------------------------------------------
    [EssenceClass(UseIn = EssenceFunctions.All)]
    public class TestEssence : IEquatable<TestEssence>, IComparable<TestEssence>
    {
        [Essence(Order=0)] public int MyInt           { get; set; }
        [Essence(Order=1)] public string MyString     { get; set; }
        [Essence(Order=2)] public DateTime MyDateTime { get; set; }

        public override int GetHashCode()                                { return Essence<TestEssence>.GetHashCodeStatic(this); }
    ...
    }

    // --------------------------------------------------------------------
    // EQUIVALENT HAND WRITTEN CODE:
    // --------------------------------------------------------------------
    public class TestManual
    {
        public int MyInt;
        public string MyString;
        public DateTime MyDateTime;

        public override int GetHashCode()
        {
            var x = MyInt.GetHashCode();
            x *= Essence<TestEssence>.HashCodeMultiplier;
            x ^= (MyString == null) ? 0 : MyString.GetHashCode();
            x *= Essence<TestEssence>.HashCodeMultiplier;
            x ^= MyDateTime.GetHashCode();
            return x;
        }
    ...
    }

Anyway, the project needs polishing, if anyone thinks it is worthwhile, but the ideas are there...

冷清清 2024-07-23 17:58:50

It looks pretty good to me; I only have one issue: it is a shame that you have to use an object[] to pass in the values, as this will box any value types you send to the function. I don't think you have much of a choice, though, unless you go the route of creating some generic overloads like others have suggested.

伊面 2024-07-23 17:58:50

On general principle you should scope your unchecked as narrowly as you reasonably can, though it doesn't matter much here. Other than that, looks fine.
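As a rough sketch of what narrower scoping could look like, the unchecked expression form can wrap just the arithmetic that may overflow, leaving the loop and the null check under the project's default overflow setting. The class and method names here are made up for illustration, and the combining step is the multiply-then-add form suggested elsewhere in this thread rather than the formula from the question.

```csharp
using System;

// Hypothetical helper, not part of the original post.
public static class NarrowUncheckedExample
{
    private const int SeedPrime = 691;   // same constants as in the question
    private const int FieldPrime = 397;

    public static int Combine(params object[] fields)
    {
        if (fields == null) throw new ArgumentNullException(nameof(fields));

        int hashCode = SeedPrime;
        foreach (object field in fields)
        {
            if (field == null) continue;   // null fields are skipped, as in the question

            int fieldHash = field.GetHashCode();

            // Only this expression is allowed to wrap around silently.
            hashCode = unchecked(hashCode * FieldPrime + fieldHash);
        }
        return hashCode;
    }
}
```

Whether this is worth doing for a method this small is a matter of taste; as the answer says, it doesn't matter much here.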

非要怀念 2024-07-23 17:58:50

public override int GetHashCode() {
    return this.GetHashCodeFromFields(field1, field2, field3, this);
}

(yes, I'm very pedantic but this is the only problem that I see)

梦里寻她 2024-07-23 17:58:50

More optimal:

  1. Create a code generator that uses reflection to look through your business object fields and creates a new partial class which overrides GetHashCode() (and Equals()).
  2. Run the code generator when your program starts up in debug mode, and if the code has changed, exit with a message to the developer to recompile.

The advantages of this are:

  • Using reflection you know which fields are value types or not, and hence whether they need null checks.
  • There are no overheads - no extra function calls, no list construction, etc. This is important if you are doing lots of dictionary lookups.
  • Long implementations (in classes with lots of fields) are hidden in partial classes, away from your important business code.

Disadvantages:

  • Overkill if you don't do lots of dictionary lookups/calls to GetHashCode().
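
The first step above could be sketched roughly as follows. This is only an illustration under several assumptions: the names HashCodeGenerator and SampleOrder are hypothetical, namespaces, generic types and nested types are ignored, auto-property backing fields are skipped, and the generated body uses the multiply-then-add combining step suggested elsewhere in this thread. The second step (running the generator at debug startup and comparing against the file on disk) is not shown.

```csharp
using System;
using System.Reflection;
using System.Text;

// Hypothetical sample type; it must be partial so generated code can extend it.
public partial class SampleOrder
{
    public int Id;
    public string Customer;
}

public static class HashCodeGenerator
{
    // Returns C# source text for a partial class that overrides GetHashCode().
    public static string Generate(Type type)
    {
        if (type == null) throw new ArgumentNullException(nameof(type));

        FieldInfo[] fields = type.GetFields(
            BindingFlags.Instance | BindingFlags.Public |
            BindingFlags.NonPublic | BindingFlags.DeclaredOnly);

        // Reflection does not guarantee field order, so sort for stable output.
        Array.Sort(fields, (a, b) => string.CompareOrdinal(a.Name, b.Name));

        var sb = new StringBuilder();
        sb.AppendLine("public partial class " + type.Name);
        sb.AppendLine("{");
        sb.AppendLine("    public override int GetHashCode()");
        sb.AppendLine("    {");
        sb.AppendLine("        unchecked");
        sb.AppendLine("        {");
        sb.AppendLine("            int hashCode = 691;");

        foreach (FieldInfo field in fields)
        {
            // Compiler-generated backing fields have names that are not valid identifiers.
            if (field.Name.StartsWith("<", StringComparison.Ordinal)) continue;

            // Value types (including Nullable<T>) can call GetHashCode() directly;
            // reference types get a null check.
            string term = field.FieldType.IsValueType
                ? field.Name + ".GetHashCode()"
                : "(" + field.Name + " == null ? 0 : " + field.Name + ".GetHashCode())";

            sb.AppendLine("            hashCode = hashCode * 397 + " + term + ";");
        }

        sb.AppendLine("            return hashCode;");
        sb.AppendLine("        }");
        sb.AppendLine("    }");
        sb.AppendLine("}");
        return sb.ToString();
    }
}
```

Calling HashCodeGenerator.Generate(typeof(SampleOrder)) returns source text that would be saved next to the hand-written part of the class and compiled with it.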
∞梦里开花 2024-07-23 17:58:50

I should point out that you should almost never allocate while implementing GetHashCode (here are some useful blog posts about it).

The way that params works (generating a new array on the fly) means this is really not a good general solution. You would be better off using one method call per field and maintaining the hash state as a variable passed to each call (this also makes it easy to use better hashing functions and avalanching).
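
A minimal sketch of that shape, assuming a hypothetical HashState helper and a made-up Point3 class: each call mixes one field into a running int, so no params array is built, and the generic parameter avoids boxing for typical value types. The mixing step is a plain multiply-then-add; a stronger mixing or avalanching function could be swapped into Add without touching the callers.

```csharp
using System;

// Hypothetical helper, not taken from the original answer.
public static class HashState
{
    public const int Seed = 691;

    // Mixes one field into the running hash state and returns the new state.
    public static int Add<T>(int state, T value)
    {
        int fieldHash = value == null ? 0 : value.GetHashCode();
        return unchecked(state * 397 + fieldHash);
    }
}

// Made-up example type showing the per-field call pattern.
public sealed class Point3
{
    private readonly int x;
    private readonly string label;
    private readonly double z;

    public Point3(int x, string label, double z)
    {
        this.x = x;
        this.label = label;
        this.z = z;
    }

    public override bool Equals(object obj)
    {
        var other = obj as Point3;
        return other != null && x == other.x && label == other.label && z.Equals(other.z);
    }

    public override int GetHashCode()
    {
        int h = HashState.Seed;
        h = HashState.Add(h, x);
        h = HashState.Add(h, label);   // null-safe
        h = HashState.Add(h, z);
        return h;
    }
}
```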

盛装女皇 2024-07-23 17:58:50

Apart from the problems arising from using params object[] fields, I think not using the type information may be a performance issue in some situations too. Suppose two classes A and B have the same number and types of fields and implement the same interface I. Now if you put A and B objects into a Dictionary<I, anything>, objects with equal fields and different types will end up in the same bucket. I'd probably insert a statement like hashCode ^= GetType().GetHashCode();

Jonathan Rupp's accepted answer deals with the params array but does not deal with boxing of value types. So, if performance is very important, I'd probably declare GetHashCodeFromFields with int parameters instead of object, and pass not the fields themselves but their hash codes, i.e.

public override int GetHashCode() 
{
    return this.GetHashCodeFromFields(field1.GetHashCode(), field2.GetHashCode());
}
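
Putting both suggestions together, a two-argument overload might look roughly like this. It is a sketch, not the answerer's code: the class names IntHashExtensions and Pair are hypothetical, only the two-int case is shown, and the combining step is the multiply-then-add form suggested elsewhere in this thread. Note that with int parameters the caller becomes responsible for null checks on reference-type fields.

```csharp
using System;

// Hypothetical extension class; only the two-argument overload is sketched.
public static class IntHashExtensions
{
    public static int GetHashCodeFromFields(this object obj, int hash1, int hash2)
    {
        if (obj == null) throw new ArgumentNullException(nameof(obj));

        unchecked
        {
            int hashCode = 691;
            hashCode = hashCode * 397 + hash1;
            hashCode = hashCode * 397 + hash2;

            // Mix in the runtime type so that different classes with equal
            // field values are less likely to land in the same bucket.
            hashCode ^= obj.GetType().GetHashCode();
            return hashCode;
        }
    }
}

// Made-up example type.
public sealed class Pair
{
    private readonly int number;
    private readonly string name;

    public Pair(int number, string name)
    {
        this.number = number;
        this.name = name;
    }

    public override int GetHashCode()
    {
        // The caller handles null for reference-type fields.
        return this.GetHashCodeFromFields(
            number.GetHashCode(),
            name == null ? 0 : name.GetHashCode());
    }
}
```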
把回忆走一遍 2024-07-23 17:58:50

One problem that can arise: once the multiplication hits 0, the final hashCode is always 0, as I just experienced with an object with a lot of properties, in the following code:

hashCode *= _fieldPrimeNumber + fields[i].GetHashCode();

I'd suggest:

hashCode = hashCode * _fieldPrimeNumber + fields[i].GetHashCode();

Or something similar with xor, like this:

hashCode = hashCode * _fieldPrimeNumber ^ fields[i].GetHashCode();
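
To make the collapse concrete, here is a small side-by-side sketch (the class name ZeroCollapseDemo is made up). With the original formula, any field whose hash code is -397 makes the factor 0, and the result stays 0 whatever the other fields are. It also explains the "lot of properties" case: every field with an odd hash code contributes an even factor, and in 32-bit wrap-around arithmetic a product with 32 or more even factors is 0. The first case relies on int.GetHashCode() returning the value itself, which is how current .NET implementations behave.

```csharp
using System;

// Hypothetical demo class comparing the two combining steps.
public static class ZeroCollapseDemo
{
    private const int Seed = 691;
    private const int FieldPrime = 397;

    // Formula from the question: hashCode *= prime + fieldHash
    public static int Original(params object[] fields)
    {
        unchecked
        {
            int hashCode = Seed;
            foreach (object field in fields)
                if (field != null)
                    hashCode *= FieldPrime + field.GetHashCode();
            return hashCode;
        }
    }

    // Suggested fix: hashCode = hashCode * prime + fieldHash
    public static int Suggested(params object[] fields)
    {
        unchecked
        {
            int hashCode = Seed;
            foreach (object field in fields)
                if (field != null)
                    hashCode = hashCode * FieldPrime + field.GetHashCode();
            return hashCode;
        }
    }
}
```

For example, Original(-397, 1, 2) and Original(-397, 99, 100) both come out as 0, while Suggested gives different values for the two inputs.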