为什么 map 比 unordered_map 快得多?

发布于 2024-10-15 07:23:05 字数 630 浏览 7 评论 0原文

我实现了一个搜索缓存结果,其中包含 State 类型的键(一个包含 7 个短整数的类)和 Score 类型的值(一个包含 3 个双精度数的类)。使用 unordered_map 至少慢 20 倍地图。为什么?

编辑:可恶!我的哈希函数是

namespace std {
    size_t hash<State>::operator()(State const& s) const {
        size_t retval = hash<short>()(s.s[0]);
        for (int i = 1; i < R; i += 2) {  // 1 3 5
            int x = (static_cast<int>(s.s[i + 1]) << 16)
                + (static_cast<int>(s.s[i]));
            hash_combine(retval, x);
        }
    }
}

我忘记返回 retval ,所以它全部发生冲突!我希望 unordered_map 有一个 hash_function_quality() 函数来报告平均冲突数。

I implemented a search caching results that consist of keys of type State (a class with 7 short ints) and values of type Score (a class of 3 doubles.) Using unordered_map was at least 20 times slower than map. Why?

Edit: Darn it! My hash function was

namespace std {
    size_t hash<State>::operator()(State const& s) const {
        size_t retval = hash<short>()(s.s[0]);
        for (int i = 1; i < R; i += 2) {  // 1 3 5
            int x = (static_cast<int>(s.s[i + 1]) << 16)
                + (static_cast<int>(s.s[i]));
            hash_combine(retval, x);
        }
    }
}

I forgot to return retval, so it was all colliding! I wish unordered_map had a hash_function_quality() function that reports the average number of collisions.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(4

来日方长 2024-10-22 07:23:05

unordered_map 的速度与哈希函数的速度成正比。这从来都不是一种直接的关系。举个例子,如果您使用最简单的散列函数:

std::size_t myHash(MyObjectType _object){ return 1; }

那么您最终会得到一个行为类似于列表而不是散列容器的集合。所有项目都将映射到一个存储桶,您必须遍历整个存储桶,直到找到所需的项目(这可能需要 O(N) 时间。)

您需要做的是查看两件事:

  1. 您使用什么哈希函数?处理过程是否需要花费大量时间?
  2. 它产生了多少次碰撞?也就是说,有多少唯一元素被映射到相同的哈希值?

其中任何一个本身都可能并且将会破坏性能。

The speed of unordered_map is directly proportional to the speed of your hashing function. It is never a straight forward relationship. Case in point, if you use the simplest hashing function:

std::size_t myHash(MyObjectType _object){ return 1; }

then what you'll end up with is a collection which behaves like a list rather than a hashed container. All the items will map to a single bucket and you'll have to traverse the entire bucket until you get to the item you desire (something that could take O(N) time.)

What you need to do is look at two things:

  1. What hashing function are you using? Does it cost a ridiculous amount of time to process?
  2. How many collisions is it producing? That is, how many unique elements are being mapped to the same hash value?

Either of those by themselves can and will kill the performance.

打小就很酷 2024-10-22 07:23:05

由于哈希函数的原因,std::unordered_map 对于少量元素来说通常很慢。这需要固定的(-ish)时间,但可能仍然是相当长的时间。

另一方面,std::mapstd::unordered_map 更简单。访问元素所需的时间取决于元素的数量,但随着元素数量的增加,时间会越来越少。与 std::unordered_map 相比,std::map 的大哦因子 c 通常也非常小。

一般来说,优先使用 std::map 而不是 std::unordered_map,除非您有特定原因使用 std::unordered_map。如果您没有大量元素,则尤其如此。

std::unordered_map is commonly slow for a small number of elements because of the hash function. It takes a fixed (-ish) amount of time, but maybe a significant amount of time nonetheless.

std::map on the other hand is simpler than std::unordered_map. The time it takes accessing an element there depends on the count of elements, but less and less so as the number of elements grows. And the big-oh factor c for a std::map is commonly very small too, compared to std::unordered_map.

In general, prefer using std::map over std::unordered_map, unless you have a specific reason to use std::unordered_map. This holds particularly if you don't have a large number of elements.

生生漫 2024-10-22 07:23:05

unordered_map 在底层使用哈希表,因此哈希性能不佳的最明显原因是冲突太多。您可以考虑使用不同的非默认哈希函数,这将为您的键类型提供更好的结果。

unordered_map uses a hash table under the hood, so most obvious reason why hash performs poorly is because you have too many collisions. You may consider using different, non-default, hash function that will give better results for your type of keys.

梦幻之岛 2024-10-22 07:23:05

为了

我希望 unordered_map 有一个
hash_function_quality() 函数
报告平均数量
碰撞。

我认为以下功能可能会有所帮助。

unordered_map::load_factor
    float load_factor() const;
The member function returns the average number of elements per bucket.

load_factor 越低,哈希函数越好。

For

I wish unordered_map had a
hash_function_quality() function that
reports the average number of
collisions.

I think the following function might be helpful.

unordered_map::load_factor
    float load_factor() const;
The member function returns the average number of elements per bucket.

Lower the load_factor, better is the hash function.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文