Is unordered_map really unordered?

Posted on 2024-09-08 07:59:29

I am very confused by the name 'unordered_map'. The name suggests that the keys are not ordered at all. But I always thought they are ordered by their hash value. Or is that wrong (because the name implies that they are not ordered)?

Or to put it differently: Is this

typedef map<K, V, HashComp<K> > HashMap;

with

template<typename T>
struct HashComp {
    // std::map invokes its comparator as comp(a, b), so this must be operator().
    bool operator()(const T& v1, const T& v2) const {
        return hash<T>()(v1) < hash<T>()(v2);
    }
};

the same as

typedef unordered_map<K, V> HashMap;

? (OK, not exactly, STL will complain here because there may be keys k1, k2 for which neither k1 < k2 nor k2 < k1 holds. You would need to use multimap and override the equality check.)

Or again differently: When I iterate through them, can I assume that the key-list is ordered by their hash value?
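
On the parenthetical about keys where neither k1 < k2 nor k2 < k1 holds: as far as the standard containers go, std::map does not reject such keys, it treats them as equivalent and keeps only the first one. The sketch below illustrates this with a deliberately colliding toy hash; ToyHash, HashLess and insert_two_colliding_keys are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Hypothetical toy hash with deliberate collisions: 1 and 11 both map to 1.
struct ToyHash {
    std::size_t operator()(int v) const { return static_cast<std::size_t>(v % 10); }
};

// Comparator that orders keys by their toy hash value only.
struct HashLess {
    bool operator()(int a, int b) const { return ToyHash()(a) < ToyHash()(b); }
};

// Inserts keys 1 and 11 and returns how many entries the map ends up holding.
std::size_t insert_two_colliding_keys() {
    std::map<int, std::string, HashLess> m;
    bool first_inserted = m.insert(std::make_pair(1, "one")).second;
    assert(first_inserted);
    // Under HashLess, 11 is equivalent to 1, so this insert is a no-op.
    m.insert(std::make_pair(11, "eleven"));
    return m.size();
}
```

So a map ordered by hash value silently drops keys whose hashes collide, whereas unordered_map keeps both because it additionally compares keys for equality.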

Comments (5)

可遇━不可求 2024-09-15 07:59:30

In answer to your edited question: no, those two snippets are not equivalent at all. std::map stores its nodes in a tree structure, while unordered_map stores them in a hash table*.

Keys are not stored in order of their "hash value" because they're not stored in any order at all. They are instead stored in "buckets", where each bucket holds the keys whose hash values map to the same bucket index (in the sketch below, all hashes with the same remainder modulo the bucket count). Basically, the implementation goes like this:

function add_value(object key, object value) {
   int hash = key.getHash();

   int bucket_index = hash % NUM_BUCKETS;
   if (buckets[bucket_index] == null) {
       buckets[bucket_index] = new linked_list();
   }
   buckets[bucket_index].add(new key_value(key, value));
}

function get_value(object key) {
   int hash = key.getHash();

   int bucket_index = hash % NUM_BUCKETS;
   if (buckets[bucket_index] == null) {
       return null;
   }

   foreach(key_value kv in buckets[bucket_index]) {
       if (kv.key == key) {
           return kv.value;
       }
   }

   // The bucket exists but does not contain this key.
   return null;
}

Obviously that's a serious simplification, and a real implementation would be much more advanced (for example, it would support resizing the bucket array, and might use a tree structure instead of a linked list for the buckets), but it should give an idea of why you can't get the values back in any particular order. See Wikipedia for more information.

* Technically, the internal implementations of std::map and unordered_map are implementation-defined, but the standard requires certain Big-O complexities for their operations, which in practice implies those internal implementations.
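
The bucket layout described above can be observed through std::unordered_map's own bucket interface (bucket_count, bucket, and the per-bucket begin(n)/end(n) iterators). The following is only a minimal sketch; keys_by_bucket and count_keys_in_buckets are hypothetical helper names, and which key lands in which bucket depends on the library implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Groups the keys of a map by the bucket they currently live in.
// Element i of the result holds the keys stored in bucket i.
std::vector<std::vector<std::string>> keys_by_bucket(
        const std::unordered_map<std::string, int>& m) {
    std::vector<std::vector<std::string>> result(m.bucket_count());
    for (std::size_t b = 0; b < m.bucket_count(); ++b) {
        // Local iterators walk the elements of a single bucket.
        for (auto it = m.begin(b); it != m.end(b); ++it) {
            // bucket(key) reports the bucket a key maps to; it must agree with b.
            assert(m.bucket(it->first) == b);
            result[b].push_back(it->first);
        }
    }
    return result;
}

// Counts all keys found by walking the buckets one by one.
std::size_t count_keys_in_buckets(const std::unordered_map<std::string, int>& m) {
    std::size_t total = 0;
    for (const auto& bucket : keys_by_bucket(m)) total += bucket.size();
    return total;
}
```

Printing the result of keys_by_bucket for a small map typically shows keys scattered across buckets with no relation to their sort order or insertion order.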

清秋悲枫 2024-09-15 07:59:30

"Unordered" doesn't mean that there isn't a linear sequence somewhere in the implementation. It means "you can't assume anything about the order of these elements".

For example, people often assume that entries will come out of a hash map in the same order they were put in. But they don't, because the entries are unordered.

As for "ordered by their hash value": hash values are generally taken from the full range of integers, but hash maps don't have 2**32 slots in them. The hash value's range will be reduced to the number of slots by taking it modulo the number of slots. Further, as you add entries to a hash map, it might change size to accommodate the new values. This can cause all the previous entries to be re-placed, changing their order.

In an unordered data structure, you can't assume anything about the order of the entries.
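
To make the resizing point concrete, here is a small sketch that forces a rehash and compares the contents before and after. The helper names iteration_order and same_keys_after_rehash are made up for this example. The only thing the code relies on is that the same set of keys survives; whether the iteration order actually changes is up to the implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Returns the keys in whatever order iteration currently yields them.
std::vector<int> iteration_order(const std::unordered_map<int, int>& m) {
    std::vector<int> keys;
    keys.reserve(m.size());
    for (const auto& kv : m) keys.push_back(kv.first);
    return keys;
}

// Forces a rehash into at least `buckets` buckets and reports whether the
// same set of keys is still present afterwards (their order may differ).
bool same_keys_after_rehash(std::unordered_map<int, int>& m, std::size_t buckets) {
    std::vector<int> before = iteration_order(m);
    m.rehash(buckets);
    std::vector<int> after = iteration_order(m);
    // Rehashing never adds or removes elements.
    assert(before.size() == after.size());
    std::sort(before.begin(), before.end());
    std::sort(after.begin(), after.end());
    return before == after;
}
```

Comparing the unsorted results of iteration_order before and after the rehash is a quick way to see on a given standard library whether the order moved.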

安静 2024-09-15 07:59:30

As the name unordered_map suggests, no ordering is specified by the C++0x standard. An unordered_map's apparent ordering will be dependent on whatever is convenient for the actual implementation.

誰ツ都不明白 2024-09-15 07:59:30

If you want an analogy, look at the RDBMS of your choice.

If you don't specify an ORDER BY clause when performing a query, the results are returned "unordered" - that is, in whatever order the database feels like. The order is not specified, and the system is free to "order" them however it likes in order to get the best performance.
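
Carrying the analogy one step further, the counterpart of adding ORDER BY is to sort explicitly when an order is needed. A minimal sketch, assuming string keys and int values; sorted_entries is a hypothetical helper name:

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Copies the entries out of the map and sorts them by key, much like adding
// ORDER BY to a query. The map itself is left untouched.
std::vector<std::pair<std::string, int>> sorted_entries(
        const std::unordered_map<std::string, int>& m) {
    std::vector<std::pair<std::string, int>> entries(m.begin(), m.end());
    std::sort(entries.begin(), entries.end(),
              [](const std::pair<std::string, int>& a,
                 const std::pair<std::string, int>& b) { return a.first < b.first; });
    // Every entry of the map was copied exactly once.
    assert(entries.size() == m.size());
    return entries;
}
```

If sorted iteration is needed all the time rather than occasionally, std::map is usually the more natural choice.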

迷鸟归林 2024-09-15 07:59:30

You are right, unordered_map is actually hash ordered. Note that most current implementations (pre TR1) call it hash_map.

The IBM C/C++ compiler documentation remarks that if you have an optimal hash function, the number of operations performed during lookup, insertion, and removal of an arbitrary element does not depend on the number of elements in the sequence, so this means that the order is not so unordered...

Now, what does it mean that it is hash ordered? Since a hash should be unpredictable, by definition you can't make any assumption about the order of the elements in the map. This is the reason why it was renamed in TR1: the old name suggested an order. Now we know that an order is actually used, but you can disregard it because it is unpredictable.
