Using a boost unordered map
Guys, I am using a dynamic programming approach to solve a problem. Here is a brief overview of the approach:
- Each value generated is identified by 25 unique keys.
- I use boost::hash_combine to generate the seed for the hash table from these 25 keys.
I store the values in a hash table declared as
boost::unordered_map<Key_Object, Data_Object, HashFunction> hashState;
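The question doesn't show the key type, so the sketch below assumes a hypothetical Key_Object holding the 25 keys as an array of ints; the mixing step reproduces the formula boost::hash_combine uses, reimplemented here so the snippet compiles without Boost:

```cpp
#include <array>
#include <cstddef>
#include <functional>

// Hypothetical key type: the question only says each value is
// identified by 25 unique keys, so an array of 25 ints stands in here.
struct Key_Object {
    std::array<int, 25> keys{};
    bool operator==(const Key_Object& other) const { return keys == other.keys; }
};

// The same mixing step boost::hash_combine performs, reimplemented so
// this sketch needs no Boost headers.
inline void hash_combine(std::size_t& seed, int value) {
    seed ^= std::hash<int>{}(value) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
}

// Hash functor folding all 25 keys into a single seed.
struct HashFunction {
    std::size_t operator()(const Key_Object& k) const {
        std::size_t seed = 0;
        for (int v : k.keys) hash_combine(seed, v);
        return seed;
    }
};
```

With a Data_Object type filled in, this is the HashFunction slot in the declaration above.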
I did some time profiling of my algorithm and found that nearly 95% of the run time is spent retrieving data from, and inserting data into, the hash table.
These were the details of my hash table:
hashState.size() 1880
hashState.load_factor() 0.610588
hashState.bucket_count() 3079
hashState.max_size() 805306456
hashState.max_load_factor() 1
hashState.max_bucket_count() 805306457
I have the following two questions:
Is there anything I can do to improve the performance of the hash table's insert/retrieve operations?
C++ STL has hash_multimap, which would also suit my requirements. How does the Boost library's unordered_map compare with hash_multimap in terms of insert/retrieve performance?
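Independently of which container library is chosen, one generic lever is pre-sizing the table so inserts never trigger a rehash. The sketch below uses std::unordered_map with plain int keys purely for illustration; boost::unordered_map offers the same knobs (a reserve() call on recent Boost versions, or rehash() on older ones, plus max_load_factor()):

```cpp
#include <cstddef>
#include <unordered_map>

// Build a table pre-sized for `expected` entries; returns true if no
// rehash occurred while inserting that many elements.
bool insert_without_rehash(std::size_t expected) {
    std::unordered_map<int, int> table;
    table.reserve(expected);                 // allocate enough buckets up front
    const std::size_t buckets = table.bucket_count();
    for (std::size_t i = 0; i < expected; ++i)
        table.emplace(static_cast<int>(i), static_cast<int>(i));
    // A raised max_load_factor() would save memory at the cost of more
    // collisions; the default of 1.0 matches the figures in the question.
    return table.bucket_count() == buckets;  // unchanged => no rehash happened
}
```

The standard guarantees that after reserve(n) no rehash occurs until size() exceeds n, so mid-run rehashing cost disappears from the profile.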
Comments (2)
If your hash function is not the culprit, the best you can do is probably to use a different map implementation. Since your keys are quite large, using unordered_map from the Boost.Intrusive library might be the best option. Alternatively, you could try closed hashing: Google SparseHash or MCT, though profiling is certainly needed, because closed hashing is recommended when elements are small enough. (SparseHash is more established and well tested, but MCT doesn't need those set_empty()/set_deleted() methods.)
EDIT:
I just noticed there is no intrusive map in the Boost library I mentioned, only set and multiset. Still, you can try the two closed hashing libraries.
EDIT 2:
STL hash_map is not standard; it is probably an extension and not portable across compilers.
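Since hash_map/hash_multimap were never standardized, the portable way to get duplicate-key behaviour today is std::unordered_multimap (standardized in C++11; boost::unordered_multimap behaves the same way). A minimal sketch of the lookup pattern, with an illustrative helper name:

```cpp
#include <cstddef>
#include <iterator>
#include <string>
#include <unordered_map>  // also provides std::unordered_multimap

// An unordered_multimap stores several entries under one key;
// equal_range retrieves all of them in one call.
std::size_t count_for_key(const std::unordered_multimap<int, std::string>& m,
                          int key) {
    auto range = m.equal_range(key);
    return static_cast<std::size_t>(std::distance(range.first, range.second));
}
```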
Are you sure that the hash function you are using is not the bottleneck?
What percentage of the time is spent in the hash function?
Can you repeat the test, replacing the inserts/retrievals with a simple call to the hash function?
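A rough way to run that experiment: time the hash calls alone over a key sequence, then time full inserts of the same keys, and compare. The sketch below is illustrative only (std::hash over long keys stands in for the 25-key functor, and the function name is made up):

```cpp
#include <chrono>
#include <cstddef>
#include <functional>
#include <unordered_map>
#include <utility>

// Returns {microseconds spent hashing only, microseconds spent inserting}
// for the same n keys, so the two costs can be compared directly.
std::pair<long long, long long> hash_vs_insert(int n) {
    using clock = std::chrono::steady_clock;
    std::hash<long> h;

    auto t0 = clock::now();
    volatile std::size_t sink = 0;           // keep calls from being optimized out
    for (long i = 0; i < n; ++i) sink = sink ^ h(i);
    auto t1 = clock::now();

    std::unordered_map<long, long> table;
    table.reserve(static_cast<std::size_t>(n));
    for (long i = 0; i < n; ++i) table.emplace(i, i);
    auto t2 = clock::now();

    auto us = [](clock::duration d) {
        return static_cast<long long>(
            std::chrono::duration_cast<std::chrono::microseconds>(d).count());
    };
    return {us(t1 - t0), us(t2 - t1)};
}
```

If the first number dominates the second, the hash function (not the table) is the bottleneck.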