在 C++ 中使用 HashMap 的最佳方法是什么?

发布于 2024-09-16 00:34:59 字数 76 浏览 5 评论 0原文

我知道 STL 有 HashMap API,但我找不到任何好的、完整的文档以及与此相关的好示例。

任何好的例子将不胜感激。

I know that STL has a HashMap API, but I cannot find any good and thorough documentation with good examples regarding this.

Any good examples will be appreciated.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(5

要走干脆点 2024-09-23 00:35:00

标准库包括有序和无序映射(std::mapstd::unordered_map )容器。在有序映射 (std::map) 中,元素按键排序,插入和访问位于 O(log n)。通常标准库内部使用 红黑树 来表示有序映射。但这只是一个实现细节。在无序映射 (std::unordered_map) 中,插入和访问的时间复杂度为 O(1)。它只是哈希表的另一个名称。

(有序)std::map 的示例:

#include <map>
#include <iostream>
#include <cassert>

int main(int argc, char **argv)
{
  std::map<std::string, int> m;
  m["hello"] = 23;
  // check if key is present
  if (m.find("world") != m.end())
    std::cout << "map contains key world!\n";
  // retrieve
  std::cout << m["hello"] << '\n';
  std::map<std::string, int>::iterator i = m.find("hello");
  assert(i != m.end());
  std::cout << "Key: " << i->first << " Value: " << i->second << '\n';
  return 0;
}

输出:

23
Key: hello Value: 23

如果您需要在容器中进行排序并且适合 O(log n) 运行时,那么只需使用 std::map.

否则,如果您确实需要哈希表(O(1) 插入/访问),请查看 std::unordered_map,它与 std::map 类似API(例如,在上面的示例中,您只需搜索 map 并将其替换为 unordered_map)。

unordered_map 容器是随 C++11 标准 修订。因此,根据您的编译器,您必须启用 C++11 功能(例如,当使用 GCC 4.8 时,您必须将 -std=c++11 添加到 CXXFLAGS)。

即使在 C++11 版本之前,GCC 也支持 unordered_map - 在命名空间 std::tr1 中。因此,对于旧的 GCC 编译器,您可以尝试像这样使用它:

#include <tr1/unordered_map>

std::tr1::unordered_map<std::string, int> m;

它也是 boost 的一部分,即您可以使用相应的 boost-header 以获得更好的可移植性。

The standard library includes the ordered and the unordered map (std::map and std::unordered_map) containers. In an ordered map (std::map) the elements are sorted by the key, insert and access is in O(log n). Usually the standard library internally uses red black trees for ordered maps. But this is just an implementation detail. In an unordered map (std::unordered_map) insert and access is in O(1). It is just another name for a hashtable.

An example with (ordered) std::map:

#include <map>
#include <iostream>
#include <cassert>

int main(int argc, char **argv)
{
  std::map<std::string, int> m;
  m["hello"] = 23;
  // check if key is present
  if (m.find("world") != m.end())
    std::cout << "map contains key world!\n";
  // retrieve
  std::cout << m["hello"] << '\n';
  std::map<std::string, int>::iterator i = m.find("hello");
  assert(i != m.end());
  std::cout << "Key: " << i->first << " Value: " << i->second << '\n';
  return 0;
}

Output:

23
Key: hello Value: 23

If you need ordering in your container and are fine with the O(log n) runtime then just use std::map.

Otherwise, if you really need a hash-table (O(1) insert/access), check out std::unordered_map, which has a similar to std::map API (e.g. in the above example you just have to search and replace map with unordered_map).

The unordered_map container was introduced with the C++11 standard revision. Thus, depending on your compiler, you have to enable C++11 features (e.g. when using GCC 4.8 you have to add -std=c++11 to the CXXFLAGS).

Even before the C++11 release GCC supported unordered_map - in the namespace std::tr1. Thus, for old GCC compilers you can try to use it like this:

#include <tr1/unordered_map>

std::tr1::unordered_map<std::string, int> m;

It is also part of boost, i.e. you can use the corresponding boost-header for better portability.

北渚 2024-09-23 00:35:00

hash_map 是出于标准化目的而称为 unordered_map 的较旧的非标准化版本(最初在 TR1 中,自 C++11 起包含在标准中)。顾名思义,它与 std::map 的不同之处主要在于它是无序的——例如,如果您迭代从 begin()的映射end() 中,您可以按键 1 按顺序获取项目,但如果您从 begin() 迭代一个 unordered_mapend(),您以或多或少任意的顺序获得项目。

unordered_map 通常预计具有恒定的复杂性。也就是说,插入、查找等通常基本上花费固定量的时间,而不管表中有多少项。 std::map 的复杂性与所存储的项目数量成对数 - 这意味着插入或检索项目的时间会增加,但速度相当缓慢,因为地图变大。例如,如果查找 100 万个项目之一需要 1 微秒,那么您可以预期查找 200 万个项目之一需要大约 2 微秒,查找 400 万个项目之一需要 3 微秒,查找 800 万个项目之一需要 4 微秒但从

实际角度来看,这并不是故事的全部。从本质上讲,简单的哈希表具有固定的大小。使其适应通用容器的可变大小要求有些不简单。因此,(可能)增加表的操作(例如,插入)可能相对较慢(也就是说,大多数操作相当快,但有时会慢得多)。查找不能改变表的大小,通常要快得多。因此,与插入次数相比,当您进行大量查找时,大多数基于哈希的表往往会处于最佳状态。对于插入大量数据,然后遍历表一次以检索结果的情况(例如,计算文件中唯一单词的数量),std::map 可能只是一样快,而且很可能甚至更快(但是,同样,计算复杂性是不同的,因此这也可能取决于文件中唯一单词的数量)。



1 其中顺序由创建地图时的第三个模板参数定义,默认情况下为 std::less

A hash_map is an older, unstandardized version of what for standardization purposes is called an unordered_map (originally in TR1, and included in the standard since C++11). As the name implies, it's different from std::map primarily in being unordered -- if, for example, you iterate through a map from begin() to end(), you get items in order by key1, but if you iterate through an unordered_map from begin() to end(), you get items in a more or less arbitrary order.

An unordered_map is normally expected to have constant complexity. That is, an insertion, lookup, etc., typically takes essentially a fixed amount of time, regardless of how many items are in the table. An std::map has complexity that's logarithmic on the number of items being stored -- which means the time to insert or retrieve an item grows, but quite slowly, as the map grows larger. For example, if it takes 1 microsecond to lookup one of 1 million items, then you can expect it to take around 2 microseconds to lookup one of 2 million items, 3 microseconds for one of 4 million items, 4 microseconds for one of 8 million items, etc.

From a practical viewpoint, that's not really the whole story though. By nature, a simple hash table has a fixed size. Adapting it to the variable-size requirements for a general purpose container is somewhat non-trivial. As a result, operations that (potentially) grow the table (e.g., insertion) are potentially relatively slow (that is, most are fairly fast, but periodically one will be much slower). Lookups, which cannot change the size of the table, are generally much faster. As a result, most hash-based tables tend to be at their best when you do a lot of lookups compared to the number of insertions. For situations where you insert a lot of data, then iterate through the table once to retrieve results (e.g., counting the number of unique words in a file) chances are that an std::map will be just as fast, and quite possibly even faster (but, again, the computational complexity is different, so that can also depend on the number of unique words in the file).



1 Where the order is defined by the third template parameter when you create the map, std::less<T> by default.

情痴 2024-09-23 00:35:00

这是一个更完整和灵活的示例,它没有省略生成编译错误所需的包含:

#include <iostream>
#include <unordered_map>

class Hashtable {
    std::unordered_map<const void *, const void *> htmap;

public:
    void put(const void *key, const void *value) {
            htmap[key] = value;
    }

    const void *get(const void *key) {
            return htmap[key];
    }

};

int main() {
    Hashtable ht;
    ht.put("Bob", "Dylan");
    int one = 1;
    ht.put("one", &one);
    std::cout << (char *)ht.get("Bob") << "; " << *(int *)ht.get("one");
}

对于键仍然不是特别有用,除非它们被预定义为指针,因为匹配值不会这样做! (但是,由于我通常使用字符串作为键,因此在键的声明中用“string”替换“const void *”应该可以解决这个问题。)

Here's a more complete and flexible example that doesn't omit necessary includes to generate compilation errors:

#include <iostream>
#include <unordered_map>

class Hashtable {
    std::unordered_map<const void *, const void *> htmap;

public:
    void put(const void *key, const void *value) {
            htmap[key] = value;
    }

    const void *get(const void *key) {
            return htmap[key];
    }

};

int main() {
    Hashtable ht;
    ht.put("Bob", "Dylan");
    int one = 1;
    ht.put("one", &one);
    std::cout << (char *)ht.get("Bob") << "; " << *(int *)ht.get("one");
}

Still not particularly useful for keys, unless they are predefined as pointers, because a matching value won't do! (However, since I normally use strings for keys, substituting "string" for "const void *" in the declaration of the key should resolve this problem.)

白云不回头 2024-09-23 00:35:00

std::unordered_map 在 GCC stdlibc++ 6.4 中使用哈希映射的证据

这在以下位置提到:https://stackoverflow.com/a/3578247/895245,但在以下答案中:C++ 中的 std::map 内部有什么数据结构? 我已经通过以下方式为 GCC stdlibc++ 6.4 实现提供了进一步的证据:

  • GDB 步骤调试到类
  • 性能特征分析

以下是该答案中描述的性能特征图的预览:

在此处输入图像描述

因此我们可以清楚地看到 std::unordered_set 具有几乎恒定的时间访问,除了当大小加倍并且底层动态数组必须在内存中重新定位时,会出现一些非常慢的点。

如何将自定义类和哈希函数与unordered_map结合使用

这个答案说明了这一点:使用自定义类类型作为键的 C++ unordered_map

摘录:相等:

struct Key
{
  std::string first;
  std::string second;
  int         third;

  bool operator==(const Key &other) const
  { return (first == other.first
            && second == other.second
            && third == other.third);
  }
};

哈希函数:

namespace std {

  template <>
  struct hash<Key>
  {
    std::size_t operator()(const Key& k) const
    {
      using std::size_t;
      using std::hash;
      using std::string;

      // Compute individual hash values for first,
      // second and third and combine them using XOR
      // and bit shifting:

      return ((hash<string>()(k.first)
               ^ (hash<string>()(k.second) << 1)) >> 1)
               ^ (hash<int>()(k.third) << 1);
    }
  };

}

Evidence that std::unordered_map uses a hash map in GCC stdlibc++ 6.4

This was mentioned at: https://stackoverflow.com/a/3578247/895245 but in the following answer: What data structure is inside std::map in C++? I have given further evidence of such for the GCC stdlibc++ 6.4 implementation by:

  • GDB step debugging into the class
  • performance characteristic analysis

Here is a preview of the performance characteristic graph described in that answer:

enter image description here

So we can clearly see that std::unordered_set has almost constant time access, except for a few extremely slow points when the size doubles and the underlying dynamic array has to be relocated in memory.

How to use a custom class and hash function with unordered_map

This answer nails it: C++ unordered_map using a custom class type as the key

Excerpt: equality:

struct Key
{
  std::string first;
  std::string second;
  int         third;

  bool operator==(const Key &other) const
  { return (first == other.first
            && second == other.second
            && third == other.third);
  }
};

Hash function:

namespace std {

  template <>
  struct hash<Key>
  {
    std::size_t operator()(const Key& k) const
    {
      using std::size_t;
      using std::hash;
      using std::string;

      // Compute individual hash values for first,
      // second and third and combine them using XOR
      // and bit shifting:

      return ((hash<string>()(k.first)
               ^ (hash<string>()(k.second) << 1)) >> 1)
               ^ (hash<int>()(k.third) << 1);
    }
  };

}
茶花眉 2024-09-23 00:35:00

对于我们这些试图弄清楚如何在仍然使用标准模板的同时对我们自己的类进行哈希处理的人来说,有一个简单的解决方案:

  1. 在您的类中,您需要定义一个相等运算符重载== 。如果您不知道如何执行此操作,GeeksforGeeks 有一个很棒的教程 https://www .geeksforgeeks.org/operator-overloading-c/

  2. 在标准命名空间下,声明一个名为 hash 的模板结构,并将您的类名作为类型(见下文)。我发现了一篇很棒的博文,其中还展示了使用 XOR 和位移位计算哈希值的示例,但这超出了这个问题的范围,但它还包含有关如何使用哈希函数完成的详细说明 https://prateekvjoshi.com/2014/06/05 /using-hash-function-in-c-for-user-defined-classes/

namespace std {

  template<>
  struct hash<my_type> {
    size_t operator()(const my_type& k) {
      // Do your hash function here
      ...
    }
  };

}
  1. 因此,要使用新的哈希函数实现哈希表,您只需创建一个 std::map 或 std::unordered_map 就像您通常所做的那样并使用 my_type 作为键,标准库将自动使用您之前定义的哈希函数(在步骤 2 中) )来散列你的密钥。
#include <unordered_map>

int main() {
  std::unordered_map<my_type, other_type> my_map;
}

For those of us trying to figure out how to hash our own classes whilst still using the standard template, there is a simple solution:

  1. In your class you need to define an equality operator overload ==. If you don't know how to do this, GeeksforGeeks has a great tutorial https://www.geeksforgeeks.org/operator-overloading-c/

  2. Under the standard namespace, declare a template struct called hash with your classname as the type (see below). I found a great blogpost that also shows an example of calculating hashes using XOR and bitshifting, but that's outside the scope of this question, but it also includes detailed instructions on how to accomplish using hash functions as well https://prateekvjoshi.com/2014/06/05/using-hash-function-in-c-for-user-defined-classes/

namespace std {

  template<>
  struct hash<my_type> {
    size_t operator()(const my_type& k) {
      // Do your hash function here
      ...
    }
  };

}
  1. So then to implement a hashtable using your new hash function, you just have to create a std::map or std::unordered_map just like you would normally do and use my_type as the key, the standard library will automatically use the hash function you defined before (in step 2) to hash your keys.
#include <unordered_map>

int main() {
  std::unordered_map<my_type, other_type> my_map;
}
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文