GetHashCode extension method
After reading all the questions and answers on StackOverflow concerning overriding GetHashCode(), I wrote the following extension method for easy and convenient overriding of GetHashCode():
public static class ObjectExtensions
{
    private const int _seedPrimeNumber = 691;
    private const int _fieldPrimeNumber = 397;

    public static int GetHashCodeFromFields(this object obj, params object[] fields) {
        unchecked { //unchecked to prevent throwing overflow exception
            int hashCode = _seedPrimeNumber;
            for (int i = 0; i < fields.Length; i++)
                if (fields[i] != null)
                    hashCode *= _fieldPrimeNumber + fields[i].GetHashCode();
            return hashCode;
        }
    }
}
(I basically only refactored the code that someone posted there, because I really like that it can be used generally)
which I use like this:
public override int GetHashCode() {
    return this.GetHashCodeFromFields(field1, field2, field3);
}
Do you see any problems with this code?
Comments (9)
That looks like a solid way to do it.
My only suggestion is that if you're really concerned about performance with it, you may want to add generic versions for several common cases (i.e. probably 1-4 args). That way, for those objects (which are most likely to be small, key-style composite objects), you won't have the overhead of building the array to pass to the method, the loop, any boxing of generic values, etc. The call syntax will be exactly the same, but you'll run slightly more optimized code for that case. Of course, I'd run some perf tests over this before you decide whether it's worth the maintenance trade-off.
Something like this:
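The code that originally followed did not survive on this page. As a rough sketch of what such generic overloads might look like (the class name GenericHashExtensions and the private Combine helper are my own placeholders, and the combining step is simply copied from the question so results match the params version):

```csharp
public static class GenericHashExtensions
{
    private const int SeedPrimeNumber = 691;
    private const int FieldPrimeNumber = 397;

    // Hypothetical helper: same combining step as the question, but generic,
    // so value-type fields are not boxed into an object[].
    private static int Combine<T>(int hashCode, T field)
    {
        if (field == null) return hashCode; // null fields are skipped, as in the question
        unchecked { return hashCode * (FieldPrimeNumber + field.GetHashCode()); }
    }

    public static int GetHashCodeFromFields<T1>(this object obj, T1 f1)
        => Combine(SeedPrimeNumber, f1);

    public static int GetHashCodeFromFields<T1, T2>(this object obj, T1 f1, T2 f2)
        => Combine(Combine(SeedPrimeNumber, f1), f2);

    public static int GetHashCodeFromFields<T1, T2, T3>(this object obj, T1 f1, T2 f2, T3 f3)
        => Combine(Combine(Combine(SeedPrimeNumber, f1), f2), f3);
}
```

The call site stays this.GetHashCodeFromFields(field1, field2, field3); no array is allocated for up to three fields in this sketch.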
I wrote some stuff a little while back that might solve your problem... (And actually, it could probably be improved to include the seed that you have...)
Anyway, the project is called Essence ( http://essence.codeplex.com/ ), and it uses the System.Linq.Expression libraries to generate (based on attributes) standard representations of Equals/GetHashCode/CompareTo/ToString, as well as being able to create IEqualityComparer and IComparer classes based on an argument list. (I also have some further ideas, but would like to get some community feedback before continuing too much further.)
(What this means is that it's almost as fast as being handwritten. The main one where it isn't is CompareTo(), because Linq.Expressions doesn't have the concept of a variable in the 3.5 release, so you have to call CompareTo() on the underlying object twice when you don't get a match. Using the DLR extensions to Linq.Expressions solves this. I suppose I could have emitted IL, but I wasn't that inspired at the time.)
It's quite a simple idea, but I haven't seen it done before.
Now the thing is, I kind of lost interest in polishing it (which would have included writing an article for codeproject, documenting some of the code, or the like), but I might be persuaded to do so if you feel it would be something of interest.
(The codeplex site doesn't have a downloadable package; just go to the source and grab that. Oh, and it's written in F# (although all the test code is in C#), as that was the thing I was interested in learning.)
Anyway, here is a C# example from the tests in the project:
Anyway, the project, if anyone thinks it is worthwhile, needs polishing, but the ideas are there...
It looks pretty good to me; I only have one issue: it is a shame that you have to use an object[] to pass in the values, as this will box any value types you send to the function. I don't think you have much of a choice though, unless you go the route of creating some generic overloads like others have suggested.
On general principle you should scope your unchecked as narrowly as you reasonably can, though it doesn't matter much here. Other than that, looks fine.
(yes, I'm very pedantic but this is the only problem that I see)
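To illustrate the point about scoping unchecked narrowly, here is a minimal sketch of the question's method with only the overflowing arithmetic wrapped, using the unchecked(...) expression form. The class name NarrowUncheckedExample is a placeholder of mine; the behaviour is meant to be identical to the original.

```csharp
public static class NarrowUncheckedExample
{
    private const int SeedPrimeNumber = 691;
    private const int FieldPrimeNumber = 397;

    public static int GetHashCodeFromFields(this object obj, params object[] fields)
    {
        int hashCode = SeedPrimeNumber;
        foreach (object field in fields)
        {
            if (field != null)
            {
                // Only the multiplication and addition that may overflow are unchecked;
                // the loop and the null test stay in the default context.
                hashCode = unchecked(hashCode * (FieldPrimeNumber + field.GetHashCode()));
            }
        }
        return hashCode;
    }
}
```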
More optimal:
The advantages of this are:
Disadvantages:
I should point out that you should almost never do allocation while implementing GetHashCode (here are some useful blog posts about it).
The way that params works (generating a new array on the fly) means this is really not a good general solution. You would be better off using a method call per field and maintaining the hash state as a variable passed to them (this makes it easy to use better hashing functions and avalanching too).
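A minimal sketch of the "one call per field, hash state passed along" idea, under my own assumptions: HashState, AddField and LabeledPoint are made-up names, and the multiply-then-add step is one common choice, not something this answer specifies. Nothing is allocated and no value type is boxed.

```csharp
public static class HashState
{
    public const int Seed = 691;
    private const int FieldPrime = 397;

    // Takes the running hash state, folds in one field, returns the new state.
    public static int AddField<T>(this int state, T field)
    {
        int fieldHash = field == null ? 0 : field.GetHashCode();
        unchecked { return state * FieldPrime + fieldHash; }
    }
}

// Hypothetical type showing the call pattern.
public sealed class LabeledPoint
{
    private readonly int x;
    private readonly int y;
    private readonly string label;

    public LabeledPoint(int x, int y, string label)
    {
        this.x = x;
        this.y = y;
        this.label = label;
    }

    public override bool Equals(object other)
        => other is LabeledPoint p && p.x == x && p.y == y && p.label == label;

    public override int GetHashCode()
        => HashState.Seed.AddField(x).AddField(y).AddField(label);
}
```

Because the state is just an int threaded through the calls, swapping in a stronger mixing step later only means changing AddField.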
Apart from the problems arising from using params object[] fields, I think not using the type information may be a performance issue in some situations too. Suppose two classes A and B have the same types and number of fields and implement the same interface I. Now if you put A and B objects into a Dictionary<I, anything>, objects with equal fields and different types will end up in the same bucket. I'd probably insert some statement like hashCode ^= GetType().GetHashCode();
Jonathan Rupp's accepted answer deals with the params array but does not deal with boxing of value types. So, if performance is very important, I'd probably declare GetHashCodeFromFields as taking not object but int parameters, and send not the fields themselves but the hash codes of the fields, i.e.
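A sketch of both suggestions from the answer about type information and int parameters, with assumptions spelled out: the params int[] signature, the class name TypedHashExtensions and the sample types IShape and A are my own guesses for illustration, and the combining step is the one from the question.

```csharp
public static class TypedHashExtensions
{
    private const int SeedPrimeNumber = 691;
    private const int FieldPrimeNumber = 397;

    // Callers pass field hash codes (ints), so value-type fields are never boxed.
    public static int GetHashCodeFromFields(this object obj, params int[] fieldHashCodes)
    {
        unchecked
        {
            int hashCode = SeedPrimeNumber;
            foreach (int fieldHash in fieldHashCodes)
                hashCode *= FieldPrimeNumber + fieldHash;
            // Mix in the runtime type so different classes with equal fields
            // are less likely to land in the same bucket.
            hashCode ^= obj.GetType().GetHashCode();
            return hashCode;
        }
    }
}

public interface IShape { }

// Hypothetical class; Equals is omitted only to keep the sketch short.
public sealed class A : IShape
{
    public int Size;

    public override int GetHashCode()
        => this.GetHashCodeFromFields(Size.GetHashCode());
}
```

Note that params int[] still allocates an array per call; fixed-arity int overloads would avoid that too.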
One problem that could arise is that once the multiplication hits 0, the final hashCode is always 0, as I just experienced with an object with a lot of properties, in the following code:
I'd suggest:
Or something similar with xor, like this:
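The snippets from this answer are missing here, so the following is only a sketch of the failure and of one common alternative (multiply the running hash by the prime, then add the field hash). The names ZeroCollapseDemo, Multiplicative and MultiplyAdd are mine. The demonstration relies on int.GetHashCode() returning the integer's own value, which is how current .NET implementations behave: a field equal to 3 contributes the factor 400 = 2^4 * 25, so eight such fields contribute 2^32 and the 32-bit product wraps to exactly 0.

```csharp
public static class ZeroCollapseDemo
{
    private const int Seed = 691;
    private const int Prime = 397;

    // The question's step: multiply by (prime + field hash).
    // Factors of two accumulate, and a single zero factor is never recovered.
    public static int Multiplicative(params object[] fields)
    {
        unchecked
        {
            int hashCode = Seed;
            foreach (object f in fields)
                if (f != null) hashCode *= Prime + f.GetHashCode();
            return hashCode;
        }
    }

    // Multiply-then-add: the field hash is added after scaling,
    // so later fields can still change a state that happens to be zero.
    public static int MultiplyAdd(params object[] fields)
    {
        unchecked
        {
            int hashCode = Seed;
            foreach (object f in fields)
                hashCode = hashCode * Prime + (f == null ? 0 : f.GetHashCode());
            return hashCode;
        }
    }
}
```

With this sketch, ZeroCollapseDemo.Multiplicative(3, 3, 3, 3, 3, 3, 3, 3) returns 0, and appending further fields keeps it at 0. An xor variant would replace the + in MultiplyAdd with ^ (parenthesised as (hashCode * Prime) ^ fieldHash).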