为什么 map 比 unordered_map 快得多?
我实现了一个搜索缓存结果,其中包含 State 类型的键(一个包含 7 个短整数的类)和 Score
类型的值(一个包含 3 个双精度数的类)。使用 unordered_map 至少慢 20 倍地图。为什么?
编辑:可恶!我的哈希函数是
namespace std {
size_t hash<State>::operator()(State const& s) const {
size_t retval = hash<short>()(s.s[0]);
for (int i = 1; i < R; i += 2) { // 1 3 5
int x = (static_cast<int>(s.s[i + 1]) << 16)
+ (static_cast<int>(s.s[i]));
hash_combine(retval, x);
}
}
}
我忘记返回 retval ,所以它全部发生冲突!我希望 unordered_map 有一个 hash_function_quality() 函数来报告平均冲突数。
I implemented a search caching results that consist of keys of type State (a class with 7 short ints) and values of type Score
(a class of 3 doubles.) Using unordered_map was at least 20 times slower than map. Why?
Edit: Darn it! My hash function was
namespace std {
size_t hash<State>::operator()(State const& s) const {
size_t retval = hash<short>()(s.s[0]);
for (int i = 1; i < R; i += 2) { // 1 3 5
int x = (static_cast<int>(s.s[i + 1]) << 16)
+ (static_cast<int>(s.s[i]));
hash_combine(retval, x);
}
}
}
I forgot to return retval
, so it was all colliding! I wish unordered_map had a hash_function_quality() function that reports the average number of collisions.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
unordered_map 的速度与哈希函数的速度成正比。这从来都不是一种直接的关系。举个例子,如果您使用最简单的散列函数:
那么您最终会得到一个行为类似于列表而不是散列容器的集合。所有项目都将映射到一个存储桶,您必须遍历整个存储桶,直到找到所需的项目(这可能需要 O(N) 时间。)
您需要做的是查看两件事:
其中任何一个本身都可能并且将会破坏性能。
The speed of unordered_map is directly proportional to the speed of your hashing function. It is never a straight forward relationship. Case in point, if you use the simplest hashing function:
then what you'll end up with is a collection which behaves like a list rather than a hashed container. All the items will map to a single bucket and you'll have to traverse the entire bucket until you get to the item you desire (something that could take O(N) time.)
What you need to do is look at two things:
Either of those by themselves can and will kill the performance.
由于哈希函数的原因,
std::unordered_map
对于少量元素来说通常很慢。这需要固定的(-ish)时间,但可能仍然是相当长的时间。另一方面,
std::map
比std::unordered_map
更简单。访问元素所需的时间取决于元素的数量,但随着元素数量的增加,时间会越来越少。与std::unordered_map
相比,std::map 的大哦因子c
通常也非常小。一般来说,优先使用
std::map
而不是std::unordered_map
,除非您有特定原因使用std::unordered_map
。如果您没有大量元素,则尤其如此。std::unordered_map
is commonly slow for a small number of elements because of the hash function. It takes a fixed (-ish) amount of time, but maybe a significant amount of time nonetheless.std::map
on the other hand is simpler thanstd::unordered_map
. The time it takes accessing an element there depends on the count of elements, but less and less so as the number of elements grows. And the big-oh factorc
for a std::map is commonly very small too, compared tostd::unordered_map
.In general, prefer using
std::map
overstd::unordered_map
, unless you have a specific reason to usestd::unordered_map
. This holds particularly if you don't have a large number of elements.unordered_map
在底层使用哈希表,因此哈希性能不佳的最明显原因是冲突太多。您可以考虑使用不同的非默认哈希函数,这将为您的键类型提供更好的结果。unordered_map
uses a hash table under the hood, so most obvious reason why hash performs poorly is because you have too many collisions. You may consider using different, non-default, hash function that will give better results for your type of keys.为了
我认为以下功能可能会有所帮助。
load_factor
越低,哈希函数越好。For
I think the following function might be helpful.
Lower the
load_factor
, better is the hash function.