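What is hashCode used for? Is it unique?
I notice there is a getHashCode() method present in every control and item in WP7, which returns a sequence of numbers. Can I use this hash code to uniquely identify an item?
For example, I want to identify a picture or a song on the device and check its whereabouts. This could be done if the hash code given for a specific item is unique.
Can you help me understand what hashCode is and what getHashCode() is used for?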
Simple Explanation via Analogy
After learning what it is all about (MSDN documentation was a little too complex for me) I thought to simplify it via a "story" to (hopefully) make it easier to understand.
Summary: What is a hashcode?
It's a fingerprint.
What's it useful for? We can use this fingerprint to identify people of interest.
You can think of a hashcode as our attempt to uniquely identify someone.
I am a detective, on the lookout for a criminal. Let us call him Mr Cruel. (He was a notorious kidnapper when I was a kid: he broke into a house, kidnapped and murdered a poor girl, then dumped her body. He's still on the loose, but that's a separate matter.) Mr Cruel has certain peculiar characteristics that I can use to uniquely identify him amongst a sea of people. We have 25 million people in Australia. One of them is Mr Cruel. How can we find him?
Bad ways of Identifying Mr Cruel
Apparently Mr Cruel has blue eyes. That's not much help because almost half the population in Australia also has blue eyes.
Good ways of Identifying Mr Cruel
What else can I use? I know: I will use a fingerprint!
Advantages:
Hashcodes and Fingerprints
The above characteristics generally make for good hash functions: for a given input we want the same output every time, and ideally an output that no other input produces; if we change the input a tiny bit, we ought to get a completely different output. This output is the 'hashcode'.
So then what's a 'Collision'?
So imagine if I get a lead and I find someone matching Mr Cruel's fingerprints. Does this mean I have found Mr Cruel?
...Perhaps! I must take a closer look. If I am using SHA256 (a hashing function) and I am looking in a small town with only 5 people, then there is a very good chance I found him! But if I am using MD5 (another famous hashing function) and checking fingerprints in a town with more than 2^1000 people, then entirely different people are certain to share a fingerprint, because MD5 only has 2^128 possible outputs.
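To put rough numbers on the "small town" versus "huge town" intuition, here is a minimal sketch using the standard birthday approximation, p ≈ 1 - exp(-n(n-1) / (2 * 2^bits)). The class and method names are made up for illustration, and the estimate assumes the hash function spreads its outputs evenly, which real hash functions only approximate.

```csharp
using System;

static class CollisionOdds
{
    // Birthday approximation for the chance that at least two of
    // `people` share a fingerprint that is `bits` bits long.
    public static double Estimate(double people, int bits)
    {
        double x = people * (people - 1) / (2.0 * Math.Pow(2.0, bits));
        // For tiny x, 1 - exp(-x) rounds to 0 in double precision,
        // so fall back to the first-order approximation p ≈ x.
        return x < 1e-9 ? x : 1.0 - Math.Exp(-x);
    }
}

static class Program
{
    static void Main()
    {
        // 5 people with 256-bit fingerprints (the size of a SHA-256 output).
        Console.WriteLine(CollisionOdds.Estimate(5, 256));

        // 100,000 people with 32-bit fingerprints (the size of a .NET int hash code).
        Console.WriteLine(CollisionOdds.Estimate(100000, 32));
    }
}
```

The first figure is vanishingly small, while the second comes out well above one half, which is why a 32-bit hash code cannot serve as a unique ID for a large collection.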
So what is the benefit of all this anyway?
The only real benefit of hashcodes shows up when you want to put something in a hash table, where you want to find objects quickly. That's where the hash code comes in: it lets you find things in a hash table really quickly. It's a hack that massively improves performance, but at a small expense of accuracy.
So let's imagine we have a hash table filled with people: 25 million suspects in Australia. Mr Cruel is somewhere in there... How can we find him really quickly? We need to sort through them all, to find a potential match or to otherwise acquit potential suspects. You don't want to consider each person's unique characteristics because that would take too much time. What would you use instead? You'd use a hashcode! A hashcode can tell you if two people are different, for example that Joe Bloggs is NOT Mr Cruel. If the prints don't match, then you know it's definitely NOT Mr Cruel. But if the fingerprints do match, then depending on the hash function you used, chances are already fairly good that you found your man. But it's not 100%. The only way you can be certain is to investigate further: (i) did he/she have an opportunity/motive, (ii) witnesses, etc.
When you are using computers, if two objects have the same hash code value, you again need to investigate further whether they are truly equal. For example, you'd have to check whether the objects have the same height, the same weight and so on, whether the integers are the same, or whether the customer_id matches, and only then conclude whether they are the same. In .NET this is typically done by overriding Equals or by implementing the IEquatable<T> or IEqualityComparer<T> interface.
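As a rough illustration of that "investigate further" step, here is a minimal sketch. Track is a hypothetical type invented for this example (it is not a WP7 API), and its hash code is deliberately weak so that a collision is easy to see.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type for illustration only; not part of any WP7 API.
sealed class Track : IEquatable<Track>
{
    public string Title { get; }
    public int DurationSeconds { get; }

    public Track(string title, int durationSeconds)
    {
        Title = title;
        DurationSeconds = durationSeconds;
    }

    // Deliberately weak fingerprint: only the duration is used, so
    // different tracks of the same length collide.
    public override int GetHashCode() => DurationSeconds;

    // The "further investigation": compare the actual contents.
    public bool Equals(Track other) =>
        other != null && Title == other.Title && DurationSeconds == other.DurationSeconds;

    public override bool Equals(object obj) => Equals(obj as Track);
}

static class Program
{
    static void Main()
    {
        var a = new Track("Song A", 180);
        var b = new Track("Song B", 180);

        Console.WriteLine(a.GetHashCode() == b.GetHashCode()); // True: same fingerprint
        Console.WriteLine(a.Equals(b));                        // False: different tracks

        var library = new HashSet<Track> { a };
        Console.WriteLine(library.Contains(b));                        // False: Equals rejects the collision
        Console.WriteLine(library.Contains(new Track("Song A", 180))); // True: equal contents are found
    }
}
```

HashSet<T> uses the hash code to narrow the search and then calls Equals to confirm, so even a weak hash code only costs speed, not correctness.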
Key Summary
So basically a hashcode is a fingerprint.
Two different people/objects can still have the same fingerprint. Or in other words, if you have two fingerprints that are the same, they need not both come from the same person/object.
But the same person/object will always return the same fingerprint.
MSDN says:
Basically, hash codes exist to make hashtables possible.
Two equal objects are guaranteed to have equal hashcodes.
Two unequal objects are not guaranteed to have unequal hashcodes (that's called a collision).
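If you'd like to see a collision for yourself, here is a small sketch that keeps hashing distinct strings until two of them share a hash code. The class name, the "item" + i strings and the try limit are arbitrary choices for illustration. With a 32-bit hash code a collision usually turns up after roughly a hundred thousand strings, but which pair collides depends on the runtime (and on some runtimes changes from run to run).

```csharp
using System;
using System.Collections.Generic;

static class CollisionFinder
{
    // Returns two different strings with the same GetHashCode(),
    // or null if none turned up within maxTries.
    public static Tuple<string, string> Find(int maxTries)
    {
        var seen = new Dictionary<int, string>();
        for (int i = 0; i < maxTries; i++)
        {
            string candidate = "item" + i;
            int hash = candidate.GetHashCode();
            string earlier;
            if (seen.TryGetValue(hash, out earlier))
                return Tuple.Create(earlier, candidate);
            seen[hash] = candidate;
        }
        return null;
    }
}

static class Program
{
    static void Main()
    {
        var pair = CollisionFinder.Find(2000000);
        if (pair == null)
        {
            Console.WriteLine("No collision found; try a larger limit.");
            return;
        }
        Console.WriteLine(pair.Item1 + " and " + pair.Item2 + " share a hash code");
        Console.WriteLine(pair.Item1.Equals(pair.Item2)); // the strings themselves differ
    }
}
```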
GetHashCode() is used to help support using the object as a key in hash tables. (A similar thing exists in Java and other languages.) The goal is for every object to return a distinct hash code, but this often can't be absolutely guaranteed. It is required, though, that two logically equal objects return the same hash code. A typical hash table implementation starts with the hash code value, takes a modulus (thus constraining the value to a range) and uses the result as an index into an array of "buckets".
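The hash code, modulus and bucket steps can be sketched roughly like this. TinyHashSet is a toy written for this explanation and assumes non-null items; the real .NET collections are considerably more elaborate.

```csharp
using System;
using System.Collections.Generic;

// A toy hash set for illustration; assumes items are never null.
sealed class TinyHashSet<T>
{
    private readonly List<T>[] buckets;

    public TinyHashSet(int bucketCount)
    {
        buckets = new List<T>[bucketCount];
        for (int i = 0; i < bucketCount; i++) buckets[i] = new List<T>();
    }

    // Hash code -> non-negative value -> modulus -> bucket index.
    public int BucketIndex(T item)
    {
        int hash = item.GetHashCode() & 0x7FFFFFFF; // clear the sign bit
        return hash % buckets.Length;
    }

    public bool Add(T item)
    {
        if (Contains(item)) return false;
        buckets[BucketIndex(item)].Add(item);
        return true;
    }

    public bool Contains(T item)
    {
        // Only one bucket is searched; Equals settles collisions inside it.
        foreach (T candidate in buckets[BucketIndex(item)])
            if (candidate.Equals(item)) return true;
        return false;
    }
}

static class Program
{
    static void Main()
    {
        var set = new TinyHashSet<int>(8);
        set.Add(10);
        set.Add(18);

        // 10 and 18 land in the same bucket, yet both remain distinguishable.
        Console.WriteLine(set.BucketIndex(10) == set.BucketIndex(18));
        Console.WriteLine(set.Contains(18));
        Console.WriteLine(set.Contains(26));
    }
}
```

Because several items can share a bucket, the lookup still finishes with an Equals check, which is why a hash code alone never proves that two objects are the same.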
It's not unique to WP7; it's present on all .NET objects. It sort of does what you describe, but I would not recommend it as a unique identifier in your apps, as it is not guaranteed to be unique.
Object.GetHashCode Method
This is from the MSDN article here:
https://blogs.msdn.microsoft.com/tomarcher/2006/05/10/are-hash-codes-unique/
"While you will hear people state that hash codes generate a unique value for a given input, the fact is that, while difficult to accomplish, it is technically feasible to find two different data inputs that hash to the same value. However, the true determining factors regarding the effectiveness of a hash algorithm lie in the length of the generated hash code and the complexity of the data being hashed."
So use a hash algorithm whose output length suits the size of your data, and collisions become very unlikely, though never strictly impossible.