适合小班授课的好哈希吗? (覆盖 GetHashCode)

发布于 2024-09-08 08:11:34 字数 127 浏览 9 评论 0原文

我使用一些包含 1-2 个整数的身份类/结构,也可能是日期时间或小字符串。我用它们作为字典中的键。

对于这样的事情,对 GetHashCode 的一个好的重写是什么?事情很简单,但仍然有一定的性能希望。

谢谢

I use some identity classes/structs that contains 1-2 ints, maybe a datetime or a small string as well. I use these as keys in a dictionary.

What would be a good override of GetHashCode for something like this? Something quite simple but still somewhat performant hopefully.

Thanks

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

风追烟花雨 2024-09-15 08:11:34

查看C# 基础

它包含有关如何正确覆盖 GetHashCode() 的详细说明。

摘自书中

哈希码的目的是通过生成与对象值相对应的数字来有效地平衡哈希表

  • 必需:相等的对象必须具有相等的哈希码(如果 a.Equals(b),则 a.GetHashCode() == b.GetHashCode( )
  • 必需GetHashCode() 在特定对象的生命周期内的返回值应该是恒定的(相同的值),即使对象的数据发生变化。在许多情况下,您应该缓存方法返回来强制执行此操作。
  • 必需: GetHashCode() 不应抛出任何异常; GetHashCode() 必须始终成功返回一个值。
  • 性能:哈希代码应尽可能唯一。然而,由于哈希码仅返回一个 int,因此对于具有可能比 int 所能容纳的值更多的值的对象(几乎所有类型),哈希码必须存在重叠。 (一个明显的例子是long,因为可能的long值比int可以唯一识别的要多。)
  • 性能:可能的哈希码值应均匀分布在 int 的范围内。例如,创建一个散列时不考虑拉丁语言中字符串的分布主要集中在最初的 128 个 ASCII 字符这一事实,将导致字符串值的分布非常不均匀,并且不会是一个强大的 GetHashCode()算法。
  • 性能: GetHashCode() 应针对性能进行优化。 GetHashCode() 通常用在 Equals() 实现中,以便在哈希码不同时缩短完整的相等比较。因此,当该类型用作字典集合中的键类型时,会频繁调用它。
  • 性能:两个对象之间的微小差异应该会导致哈希码值之间存在较大差异 - 理想情况下,对象中的 1 位差异会导致大约 16 位的哈希码发生变化,平均的。这有助于确保哈希表保持平衡,无论它如何“存储”哈希值。
  • 安全性:攻击者应该很难制作具有特定哈希码的对象。该攻击是用大量数据淹没哈希表,所有数据都哈希为相同的值。哈希表实现的时间复杂度从 O(1) 变为 O(n),从而导致可能的拒绝服务攻击。

正如这里已经提到的,您还必须考虑有关重写 Equals() 的一些要点,并且有一些代码示例展示了如何实现这两个函数。

因此,这些信息应该提供一个起点,但我建议购买这本书并阅读完整的第 9 章(至少前十二部分),以获得有关如何正确实现这两个关键功能的所有要点。

Take a look into Essential C#.

It contains a detailed description on how to overwrite GetHashCode() correctly.

Extract from the book

The purpose of the hash code is to efficiently balance a hash table by generating a number that corresponds to the value of an object.

  • Required: Equal objects must have equal hash codes (if a.Equals(b), then a.GetHashCode() == b.GetHashCode())
  • Required: GetHashCode()'s returns over the life of a particular object should be constant (the same value), even if the object's data changes. In many cases, you should cache the method return to enforce this.
  • Required: GetHashCode() should not throw any exceptions; GetHashCode() must always successfully return a value.
  • Performance: Hash codes should be unique whenever possible. However, since hash code return only an int, there has to be an overlap in hash codes for objects that have potentially more values than an int can hold -- virtually all types. (An obvious example is long, since there are more possible long values than an int could uniquely identify.)
  • Performance: The possible hash code values should be distributed evenly over the range of an int. For example, creating a hash that doesn't consider the fact that distribution of a string in Latin-based languages primarily centers on the initial 128 ASCII characters would result in a very uneven distribution of string values and would not be a strong GetHashCode() algorithm.
  • Performance: GetHashCode() should be optimized for performance. GetHashCode() is generally used in Equals() implementations to short-circuit a full equals comparison if the hash codes are different. As a result, it is frequently called when the type is used as a key type in dictionary collections.
  • Performance: Small differences between two objects should result in large differences between hash codes values -- ideally, a 1-bit difference in the object results in around 16 bits of the hash code changing, on average. This helps ensure that the hash table remains balanced no matter how it is "bucketing" the hash values.
  • Security: It should be difficult for an attacker to craft an object that has a particular hash code. The attack is to flood a hash table with large amounts of data that all hash to the same value. The hash table implementation then becomes O(n) instead of O(1), resulting in a possible denial-of-service attack.

As already mentioned here you have also to consider some points about overriding Equals() and there are some code examples showing how to implement these two functions.

So these informations should give a starting point but i recommend to buy the book and to read the complete chapter 9 (at least the first twelve sides) to get all the points on how to correctly implement these two crucial functions.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文