哈希表:当第二个哈希函数返回表大小的倍数时进行双重哈希
我正在 C++ 中实现一个哈希表,通过双重哈希使用开放寻址。
我知道双重哈希背后的基本原则是:
indexInProbingSequence = (originalIndex + i * hashFunction2(key)) % tableSize
我认为我已经正确实现了这部分。这是一项家庭作业,班级的政策是我不能就任何特定的代码片段征求建议,所以你在这方面必须相信我。
似乎给我带来问题的是,有时,某些键在接受第二个哈希函数时,返回的值是(素数)表大小的倍数。在这些情况下,探测序列中的所有索引都是相同的。例如,当:
originalIndex = 32
hashFunction2(key) = 3035446
tableSize = 211
探测顺序为:
(32 + 1 * 3035446) % 211 == 32
(32 + 2 * 3035446) % 211 == 32
等等。
我缺少什么?
I am implementing a HashTable in C++, using open addressing via double hashing.
I understand that the basic principle behind double hashing is that:
indexInProbingSequence = (originalIndex + i * hashFunction2(key)) % tableSize
I think I have implemented this part correctly. This is for a homework assignment, and it is the policy of the class that I cannot ask advice on any specific piece of code, so you will have to trust me on that part.
What seems to be causing me problems is that occasionally, some keys, when subjected to the second hash function, return a value that is a multiple of the (prime) table size. In these cases, all the indices in the probing sequence are the same. For example, when:
originalIndex = 32
hashFunction2(key) = 3035446
tableSize = 211
The probing sequence is:
(32 + 1 * 3035446) % 211 == 32
(32 + 2 * 3035446) % 211 == 32
and so on.
What am I missing?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
我认为您没有错过任何内容,特别是当
hashFunction2(key) == 0
时,无论表大小如何,都会出现问题。使用
(hashFunction2(key) % (tableSize - 1) + 1)
代替hashFunction2(key)
。理想情况下,步幅是环以表大小为模的生成器(这是一种时髦的说法,表示您的探针最终覆盖整个表),否则至少会有一个很大的周期。由于你的表大小是素数,这意味着你必须避免 0。I don't think you've missed anything, and in particular the problem arises regardless of table size when
hashFunction2(key) == 0
.Use
(hashFunction2(key) % (tableSize - 1) + 1)
in place ofhashFunction2(key)
. It's desirable that the stride is a generator of the ring modulo the table size (which is a posh way of saying that your probe eventually covers the whole table), or failing that at least has a large period. Since your table size is prime that just means you have to avoid 0.