hash_map 将什么东西存储为键?

发布于 2024-09-07 10:43:34 字数 143 浏览 1 评论 0原文

我有无限的数字流,我必须检测第一个重复元素。我想到使用哈希表来解决上述问题,即每当一个数字到达时,检查它是否已经存在于哈希表中。如果有,请停止,否则将该数字添加到哈希表中。现在我的问题是哈希表是否存储整数值或仅存储与这些整数相对应的哈希值作为键?

提前致谢

I have an infinite streams of numbers coming and I have to detect the first duplicate element. I think of using hash table for the above problem i.e whenever a number arrives, check whether it is already there in the hash table or not. In case it has, stop otherwise add that number to hash table. Now my question is does hash table stores the integers values or only the hash values corresponding to those integers as key?

Thanks in advance

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

策马西风 2024-09-14 10:43:34

hash_map 是散列关联容器,这意味着它们将键与值关联起来。所以是的,你给它的值存储在 hash_map 中。通常,您无法直接访问哈希码。要解决 hash_map 的问题,您必须在获得 myInt 时执行类似 myHashMap[myInt] = true; 的操作。然后对于下一个 int,您将必须检查您的 int 是否已经存储......

因为您只需检查它是否在这里,这可能不是最好的选择。一套看起来很合适。你也许需要检查获取操作的性能,但我认为它在STL集中得到了很好的优化。

我的2c

hash_map are hashed associative container, meaning that they associate a key to a value. So yes, the value you give it is stored in the hash_map. Usually, you have not access to the hashcode directly. To solve your problem with an hash_map, you will have to do something like myHashMap[myInt] = true;when you get myInt. Then for the next int, you will have to check if your int is already stored ...

As you have just to check if it is here or not that's perhaps not the best choice. A set seems well fitted. You perhaps have to check the performance of the fetching operation, but I think it is well optimized in STL set.

my2c

暮倦 2024-09-14 10:43:34

哈希函数用于放置 对,因此当您插入一个数字作为键时,它会存储在函数分配的位置,而关联的数据作为值。

您只需尝试获取该整数作为键并增加它的值。

The hash function is used to place the pair <key,value>, so when you insert a number as a key, it get stored in the position assigned by the function with the data associated as value.

You only have to try to get that integer as a key and increase it value.

站稳脚跟 2024-09-14 10:43:34

答案取决于实现限制,但如果数字是 int32 或更小,我的首选方法是延迟分配 512MB(因此只有在您使用它时才会分配单个物理页,/dev/zero 的私有 mmap 是UNIX 下的传统方法)并将其用作位集。如果字长较小,那么您将不得不诉诸某种哈希(可能是多级)、k 叉树的变体或上述的某种组合。

请注意,多级哈希或

如果您正在查看大于 int32 的任何内容,则您没有提供足够的信息来提供建议。如果您无法为您的位组提供 512MB 的 RAM,则同样适用。

The answer depends on the implementation constraints, but if the numbers are int32's or smaller, my preferred approach is to lazy-allocate 512MB (so individual physical pages will be allocated only should you use it, a private mmap of /dev/zero is the traditional approach under unix) and use it as a bitset. If the word size is smaller then you will have to resort to some sort of hash (probably multi-level), variant of a k-ary tree or some combination of the above.

Note that the multi-level hash or

If you are looking at anything larger than an int32 you haven't provided enough information to provide advice. The same applies if you can't afford 512MB of ram for your bitset.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文