证明:为什么 java.lang.String.hashCode() 的实现与其文档相符?
java 的 JDK 文档.lang.String.hashCode()
著名说:
字符串对象的哈希码计算如下
s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
使用
int
算术,其中s[i]
是字符串的第*i
*个字符,n 是字符串的长度,
^
表示求幂。
这个表达式的标准实现是:
int hash = 0;
for (int i = 0; i < length; i++)
{
hash = 31*hash + value[i];
}
return hash;
看着这个让我感觉我在算法课程中睡觉。 这个数学表达式如何转化为上面的代码?
The JDK documentation for java.lang.String.hashCode()
famously says:
The hash code for a String object is computed as
s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
using
int
arithmetic, wheres[i]
is the *i
*th character of the string,n
is the length of the string, and^
indicates exponentiation.
The standard implementation of this expression is:
int hash = 0;
for (int i = 0; i < length; i++)
{
hash = 31*hash + value[i];
}
return hash;
Looking at this makes me feel like I was sleeping through my algorithms course. How does that mathematical expression translate into the code above?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(5)
展开循环。 然后你会得到:
现在你可以做一些数学运算,插入 0 作为初始哈希值:
进一步简化:
这本质上就是给出的原始算法。
unroll the loop. Then you get:
Now you can do some mathematical manipulation, plug in 0 for the initial hash value:
Simplify it some more:
And that is essentially the original algorithm given.
我不确定您是否错过了该文档中“^ 表示求幂”(不是异或)的地方。
每次循环时,前一个哈希值都会再次乘以 31,然后再添加到
value
的下一个元素。人们可以通过归纳法证明这些东西是相等的,但我认为一个例子可能更合适
明确:
假设我们正在处理一个 4 字符的字符串。 让我们展开循环:
现在通过将哈希的每个值替换为以下语句,将它们组合成一个语句:
31 * 0 是 0,因此简化:
现在将两个内部项乘以第二个 31:
现在将三个内部项乘以该值第一个 31:
并转换为指数(不再是真正的 Java):
I'm not sure if you missed where it says "^ indicates exponentiation" (not xor) in that documentation.
Each time through the loop, the previous value of hash is multipled by 31 again before being added to the next element of
value
.One could prove these things are equal by induction, but I think an example might be more
clear:
Say we're dealing with a 4-char string. Let's unroll the loop:
Now combine these into one statement by substituting each value of hash into the following statement:
31 * 0 is 0, so simplify:
Now multiply the two inner terms by that second 31:
Now multiply the three inner terms by that first 31:
and convert to exponents (not really Java anymore):
归纳证明:
我想我有它,并且要求提供证明。
Proof by induction:
I think I have it, and a proof was requested.
看一下前几次迭代,您会发现模式开始出现:
Take a look at the first few iterations and you'll see the pattern start to emerge:
从所有字符中计算字符串的哈希码不是毫无用处吗? 想象一下文件名或类名及其完整路径放入 HashSet 中。 或者有人使用字符串文档的哈希集而不是列表,因为“哈希集总是击败列表”。
我会做类似的事情:
最后,哈希码只不过是一个提示。
Isn't it useless at all to count the hashcode of the String out of all characters? Imagine filenames or classnames with their full path put into HashSet. Or someone who uses HashSets of String documents instead of Lists because "HashSet always beats Lists".
I would do something like:
At the end hashcode is nothing more than a hint.