C++ hash table where the keys are strings and the values are vectors of strings

Posted 2024-11-16 03:49:35

I have a large collection of unique strings (about 500k). Each string is associated with a vector of strings. I'm currently storing this data in a map<string, vector<string> > and it's working fine. However, I'd like lookups to be faster than O(log n). Under these constrained circumstances, how can I create a hash table that supports O(1) lookup? It seems like this should be possible, since I know all the keys ahead of time and all the keys are unique (so I don't have to account for collisions).

Cheers!

Comments (4)

酒与心事 2024-11-23 03:49:35

You can create a hashtable with boost::unordered_map, std::tr1::unordered_map or (on C++0x compilers) std::unordered_map. That takes almost zero effort. Google sparsehash may be faster still and tends to take less memory. (Deletion can be a pain, but it seems you won't need that.)
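As a rough illustration of the "almost zero effort" route, here is a minimal sketch using std::unordered_map from C++11. The names Table, build_table and lookup are made up for this example and do not come from the question. Lookup is O(1) on average rather than guaranteed, since collisions between distinct keys can still occur.

```cpp
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// The table from the question: string key -> vector of strings.
// (Table, build_table and lookup are hypothetical names for this sketch.)
using Table = std::unordered_map<std::string, std::vector<std::string>>;

// Build the hash table once from data that is known up front.
Table build_table(
    const std::vector<std::pair<std::string, std::vector<std::string>>>& entries) {
    Table table;
    // The number of keys is known ahead of time, so reserve buckets
    // up front to avoid rehashing while inserting.
    table.reserve(entries.size());
    for (const auto& entry : entries) {
        table.emplace(entry.first, entry.second);
    }
    return table;
}

// Average-case O(1) lookup; returns nullptr when the key is absent.
const std::vector<std::string>* lookup(const Table& table, const std::string& key) {
    auto it = table.find(key);
    return it == table.end() ? nullptr : &it->second;
}
```

Returning a pointer avoids copying the vector on every lookup and avoids operator[], which would insert an empty entry for a missing key.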

If that is still not fast enough, you can exploit prior knowledge of the keys with a minimal perfect hash, as suggested by others, to obtain guaranteed O(1) lookups. Whether the code-generation effort that takes is worth it is up to you; putting 500k keys into a tool like gperf may take a code-generator generator.

You may also want to look at CMPH, which generates a perfect hash function at run-time, though through a C API.

那伤。 2024-11-23 03:49:35

I would look into creating a perfect hash function for your table. This guarantees that no collisions occur, and collisions are expensive to resolve. Perfect hash function generators are also available.

缱倦旧时光 2024-11-23 03:49:35

What you're looking for is a Perfect Hash. gperf is often used to generate these, but I don't know how well it works with such a large collection of strings.

痴骨ら 2024-11-23 03:49:35

If you want no collisions for a known collection of keys you're looking for a perfect hash. The CMPH library (my apologies as it is for C rather than C++) is mature and can generate minimal perfect hashes for rather large data sets.
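To make the idea of a perfect hash concrete, here is a toy sketch that brute-forces a seed until a seeded string hash places every known key in its own slot. This is not how gperf or CMPH work, and it will not scale to 500k keys; it only illustrates what "collision-free for a fixed key set" means. All names (seeded_hash, PerfectIndex, build_perfect_index, find_key) are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// FNV-1a-style hash with the seed mixed into the initial state and a
// final fold so the low bits also depend on the high bits.
std::uint64_t seeded_hash(const std::string& s, std::uint64_t seed) {
    std::uint64_t h = 14695981039346656037ULL ^ seed;
    for (unsigned char c : s) {
        h ^= c;
        h *= 1099511628211ULL;
    }
    return h ^ (h >> 32);
}

struct PerfectIndex {
    std::uint64_t seed;            // seed under which no two keys share a slot
    std::vector<int> slot_to_key;  // slot -> index into the key list, -1 if empty
};

// Try seeds 0, 1, 2, ... until every key lands in a distinct slot.
// Brute force: only practical for small key sets.
PerfectIndex build_perfect_index(const std::vector<std::string>& keys,
                                 std::size_t table_size,
                                 std::uint64_t max_seed = 1000000) {
    if (table_size < keys.size() || table_size == 0) {
        throw std::invalid_argument("table_size must be >= number of keys and > 0");
    }
    for (std::uint64_t seed = 0; seed < max_seed; ++seed) {
        std::vector<int> slots(table_size, -1);
        bool collision_free = true;
        for (std::size_t i = 0; i < keys.size(); ++i) {
            std::size_t slot = seeded_hash(keys[i], seed) % table_size;
            if (slots[slot] != -1) {  // two keys share a slot: reject this seed
                collision_free = false;
                break;
            }
            slots[slot] = static_cast<int>(i);
        }
        if (collision_free) {
            return PerfectIndex{seed, slots};
        }
    }
    throw std::runtime_error("no collision-free seed found");
}

// One hash plus one string comparison per lookup. The comparison is needed
// because a key outside the known set still hashes to some slot.
int find_key(const PerfectIndex& index, const std::vector<std::string>& keys,
             const std::string& key) {
    int i = index.slot_to_key[seeded_hash(key, index.seed) % index.slot_to_key.size()];
    return (i != -1 && keys[i] == key) ? i : -1;
}
```

The returned index can be used to address a parallel std::vector<std::vector<std::string>> holding the values. A table with exactly as many slots as keys would make the hash minimal, at the cost of a much longer seed search.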
