哈希表:开放寻址和删除元素
我试图理解哈希表中的开放寻址,但有一个问题在我的文献中没有得到解答。它涉及如果使用二次探测则删除此类哈希表中的元素。然后被删除的元素被哨兵元素替换。然后 get() 操作知道它必须进一步执行,并且 add() 方法将覆盖它找到的第一个哨兵。但是,如果我想添加一个元素,其键已存在于哈希表中,但位于探测路径中的哨兵后面,会发生什么情况? add() 方法不会覆盖表中已有的相同键的实例值,而是覆盖哨兵。然后我们在哈希表中有多个具有相同键的元素。我认为这是一个问题,因为它会消耗内存,而且从哈希表中删除元素只会删除其中的第一个元素,因此仍然可以在表中找到该元素(即,它不会被删除)。
因此,在替换哨兵元素之前,似乎有必要在整个探测路径中搜索要插入的元素的键。我是否忽略了什么?实践中如何处理这个问题?
I'm trying to understand open addressing in hash tables but there is one question which isn't answered in my literature. It concerns the deletion of elements in such a hash table if quadratic probing is used. Then the removed element is replaced by a sentinel element. The get() operation then knows that it has to go further and the add() method would overwrite the first sentinel it finds. But what happens if I want to add an element with a key that is already in the hash table but behind a sentinel in a probing path? Instead of overwriting the value of the instance with the same key which is already in the table, the add() method would overwrite the sentinel. And then we have multiple elements with the same key in the hash table. I see that as a problem since it costs memory and also since removing the element from the hash table would merely remove the first of them, so that the element could still be found in the table (i.e. it is not removed).
So it seems that it is necessary to search the whole probing path for the key of the element one wants to insert before replacing a sentinel element. Am I overlooking something? How is this problem handled in practice?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
add()
必须检查探测路径中哨兵之后的每个元素,直到找到空元素,正如您稍后指出的那样。如果在探测路径中找不到新元素,并且其上有哨兵元素,则可以使用第一个哨兵槽来存储新元素。http://www.algolist.net/Data_structs/Hash_table/Open_addressing< 上有一个哈希表实现< /a> (HashMap.java)。它的 put() 方法正是这样做的。 (冲突解决是引用片段中的线性探测,但我认为从算法的角度来看这不是一个重要的区别。)
经过大量删除操作后,表中可能有太多哨兵元素。解决方案是偶尔重建哈希表(即重新哈希所有内容)(基于项目数量和哨兵元素数量)。此操作将消除哨兵元素。
另一种方法是在删除元素时从探测路径中消除哨兵 (DELETED) 元素。实际上,在这种情况下,表中没有哨兵元素;只有空闲和占用的插槽。它可能很贵。
是的。您必须进行搜索,直到找到空元素。
我对现实生活中的哈希表实现不太了解。我想其中很多都可以在互联网上的开源项目中找到。我刚刚检查了
Hashtable
< /a> 和HashMap
Java 中的类。两者都使用链接而不是开放寻址。add()
has to check every element after the sentinel(s) in the probing path till it finds an empty element, as you pointed out later. If it could not find the new element in the probing path and there are sentinel elements on it, it can use the first sentinel slot to store the new element.There is a hash table implementation on http://www.algolist.net/Data_structures/Hash_table/Open_addressing (HashMap.java). Its
put()
method does exactly this. (The collision resolution is linear probing in the referenced snippet but I don't think it's an important difference from the point of view of the algorithm.)After a lot of remove operations there could be too many sentinel elements in the table. A solution for this would be to rebuild the hash table (i.e. rehash everything) occasionally (based on the number of items and the number of sentinel elements). This operation would eliminate the sentinel elements.
Another approach is eliminating the sentinel (DELETED) element from the probing path when you remove an element. Practically, you don't have sentinel elements in the table in this case; there are only FREE and OCCUPIED slots. It could be expensive.
Yes, it is. You have to search until you find an empty element.
I don't know too much about real life hash table implementations. I suppose plenty of them are available on the internet in open source projects. I've just checked the
Hashtable
andHashMap
classes in Java. Both use chaining instead of open addressing.很抱歉迟到的答案,但 Java 有一个具有开放寻址的哈希表示例:
java.util.IdentityHashMap
。此外,您还可以使用 GNU Trove 项目。它的映射都是开放寻址哈希表,如其概述页面中所述。
Sorry for the late answer, but Java has an example of a hash table with open addressing:
java.util.IdentityHashMap
.Also, you can use GNU Trove Project. Its maps are all open addressing hash tables, as explained on its overview page.