当前位置：文江博客话题详情

如何提高使用键类型 std::string 进行地图查找的性能？

发布于 2024-07-08 18:21:10 字数 422 浏览 17 评论 0原文

我使用的是 std::map （VC++ 实现），通过地图的 find 方法进行查找有点慢。

键类型是std::string。

我可以通过地图的自定义键比较覆盖来提高此 std::map 查找的性能吗？例如，也许 std::string 在比较数据之前，compare 没有考虑简单的 string::size() 比较？

还有其他加快比较速度的想法吗？

在我的情况下，地图将始终包含 < 15个元素，但查询不停，性能至关重要。也许我可以使用更好的数据结构，速度会更快？

更新：地图包含文件路径。

更新2：地图的元素经常变化。

原文

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

最冷一天 2024-07-15 18:21:10

甚至表示集合中使用的运算符是<而不是==。

如果您不关心 set 中字符串的顺序，您可以向 set 传递一个自定义比较器，该比较器的性能比常规 less-than< /em>.

例如，如果许多字符串具有相似的前缀（但长度不同），您可以按字符串长度排序（因为 string.length 是恒定速度）。

如果您这样做，请注意一个常见错误：

struct comp {
    bool operator()(const std::string& lhs, const std::string& rhs)
    {
        if (lhs.length() < rhs.length())
            return true;
        return lhs < rhs;
    }
};

此运算符不维护严格的弱排序，如它可以将两个字符串视为一个小于另一个。

string a = "z";
string b = "aa";

按照逻辑，您将看到 comp(a, b) == true 和 comp(b, a) == true。

正确的实现是：

struct comp {
    bool operator()(const std::string& lhs, const std::string& rhs)
    {
        if (lhs.length() != rhs.length())
            return lhs.length() < rhs.length();
        return lhs < rhs;
    }
};

As Even said the operator used in a set is < not ==.

If you don't care about the order of the strings in your set you can pass the set a custom comparator that performs better than the regular less-than.

For example if a lot of your strings have similar prefixes (but they vary in length) you can sort by string length (since string.length is constant speed).

If you do so beware a common mistake:

struct comp {
    bool operator()(const std::string& lhs, const std::string& rhs)
    {
        if (lhs.length() < rhs.length())
            return true;
        return lhs < rhs;
    }
};

This operator does not maintain a strict weak ordering, as it can treat two strings as each less than the other.

string a = "z";
string b = "aa";

Follow the logic and you'll see that comp(a, b) == true and comp(b, a) == true.

The correct implementation is:

struct comp {
    bool operator()(const std::string& lhs, const std::string& rhs)
    {
        if (lhs.length() != rhs.length())
            return lhs.length() < rhs.length();
        return lhs < rhs;
    }
};

回复收藏 0 原文

留蓝 2024-07-15 18:21:10

首先，关闭所有分析和调试开关。这些会极大地减慢 STL 的速度。

如果不是这样，部分问题可能是字符串的前 80-90% 是相同的。这对于地图来说不一定是坏事，但对于字符串比较来说却是这样。如果是这种情况，您的搜索可能需要更长的时间。

例如，在此代码中，find() 可能会导致几次字符串比较，但每次比较都会在比较第一个字符直到“david”后返回，然后将检查前三个字符。所以每次调用最多会检查 5 个字符。

map<string,int> names;
names["larry"] = 1;
names["david"] = 2;
names["juanita"] = 3;

map<string,int>::iterator iter = names.find("daniel");

另一方面，在下面的代码中，find() 可能会检查 135 个以上的字符：

map<string,int> names;
names["/usr/local/lib/fancy-pants/share/etc/doc/foobar/longpath/yadda/yadda/wilma"] = 1;
names["/usr/local/lib/fancy-pants/share/etc/doc/foobar/longpath/yadda/yadda/fred"] = 2;
names["/usr/local/lib/fancy-pants/share/etc/doc/foobar/longpath/yadda/yadda/barney"] = 3;

map<string,int>::iterator iter = names.find("/usr/local/lib/fancy-pants/share/etc/doc/foobar/longpath/yadda/yadda/betty");

这是因为字符串比较必须更深入地搜索才能找到匹配项，因为每个字符串的开头都是相同的。

在比较相等性时使用 size() 在这里不会有太大帮助，因为您的数据集非常小。 std::map 保持排序，因此可以使用二分搜索来搜索其元素。每次调用 find 都会导致少于 5 次字符串比较（未命中）和平均 2 次比较（命中）。但这确实取决于您的数据。如果大多数路径字符串的长度不同，那么像 Motti 描述的大小检查可能会有很大帮助。

在考虑替代算法时需要考虑的是您获得了多少“点击”。大多数 find() 调用是否返回 end() 还是命中？如果大多数 find() 返回 end() （未命中），那么您每次都会搜索整个地图（2logn 字符串比较）。

Hash_map是个好主意；它应该可以将您的搜索时间减少大约一半；更多的是错过。

由于路径字符串的性质，可能需要自定义算法，特别是如果您的数据集具有共同的祖先（如上面的代码所示）。

另一件需要考虑的事情是如何获取搜索字符串。如果您重用它们，将它们编码成更容易比较的东西可能会有所帮助。如果您使用它们一次并丢弃它们，那么这个编码步骤可能太昂贵了。

我曾经（很久以前）使用过类似霍夫曼编码树的东西来优化字符串搜索。像这样的二叉字符串搜索树在某些情况下可能会更有效，但对于像您这样的小集合来说它相当昂贵。

最后，研究替代的 std::map 实现。我听说过有关 VC 的某些 stl 代码性能的坏消息。 DEBUG 库尤其不能在每次调用时检查您。 StlPort 曾经是一个不错的选择，但我已经好几年没有尝试过了。我也一直很喜欢 Boost。

First, turn off all the profiling and DEBUG switches. These can slow down STL immensely.

If that's not it, part of the problem may be that your strings are identical for the first 80-90% of the string. This isn't bad for map, necessarily, but it is for string comparisons. If this is the case, your search can take much longer.

For example, in this code find() will likely result in a couple of string compares, but each will return after comparing the first character until "david", and then the first three characters will be checked. So at most, 5 characters will be checked per call.

map<string,int> names;
names["larry"] = 1;
names["david"] = 2;
names["juanita"] = 3;

map<string,int>::iterator iter = names.find("daniel");

On the other hand, in the following code, find() will likely check 135+ characters:

map<string,int> names;
names["/usr/local/lib/fancy-pants/share/etc/doc/foobar/longpath/yadda/yadda/wilma"] = 1;
names["/usr/local/lib/fancy-pants/share/etc/doc/foobar/longpath/yadda/yadda/fred"] = 2;
names["/usr/local/lib/fancy-pants/share/etc/doc/foobar/longpath/yadda/yadda/barney"] = 3;

map<string,int>::iterator iter = names.find("/usr/local/lib/fancy-pants/share/etc/doc/foobar/longpath/yadda/yadda/betty");

That's because the string comparisons have to search deeper to find a match since the beginning of each string is the same.

Using size() in your comparison for equality won't help you much here since your data set is so small. A std::map is kept sorted so its elements can be searched with a binary search. Each call to find should result in less than 5 string comparisons for a miss, and an average of 2 comparisons for a hit. But it does depend on your data. If most of your path strings are of different lengths, then a size check like Motti describes could help a lot.

Something to consider when thinking of alternative algorithms is how many many "hits" you get. Are most of your find() calls returning end() or a hit? If most of your find()s return end() (misses) then you are searching the entire map every time (2logn string compares).

Hash_map is a good idea; it should cut your search time in about half for hits; more for misses.

A custom algorithm may be called for because of the nature of path strings, especially if your data set has common ancestry like in the above code.

Another thing to consider is how you get your search strings. If you are reusing them, it may help to encode them into something that is easier to compare. If you use them once and discard them, then this encoding step is probably too expensive.

I used something like a Huffman coding tree once (a long time ago) to optimize string searches. A binary string search tree like that may be more efficient in some cases, but its pretty expensive for small sets like yours.

Finally, look into alternative std::map implementations. I've heard bad things about some of VC's stl code performance. The DEBUG library in particular is bad about checking you on every call. StlPort used to be a good alternative, but I haven't tried it in a few years. I've always loved Boost too.

回复收藏 0 原文

森林很绿却致人迷途 2024-07-15 18:21:10

第一件事是尝试使用 hash_map 如果可能的话 - 你是对的，标准字符串比较不会首先检查大小（因为它按字典顺序比较），但编写自己的映射代码是你最好避免的事情。从您的问题来看，您似乎不需要迭代范围；在这种情况下，map 没有 hash_map 没有的东西。

它还取决于您的地图中拥有哪种类型的键。它们通常很长吗？还有“有点慢”是什么意思？如果您还没有对代码进行概要分析，那么很可能它是一个需要时间的不同部分。

更新：嗯，程序中的瓶颈是map::find，但map 的元素总是少于15 个。这让我怀疑这个配置文件在某种程度上具有误导性，因为在这么小的地图上找到一个东西根本不应该慢。事实上，map::find 应该非常快，只是分析的开销可能比 find 调用本身还要多。我不得不再问一次，你确定这真的是你程序的瓶颈吗？你说字符串是路径，但你在这个循环中没有进行任何类型的操作系统调用、文件系统访问、磁盘访问？其中任何一个都应该比小地图上的 map::find 慢几个数量级。实际上，任何获取字符串的方法都应该比 map::find 慢。

回复收藏 0 原文

橘和柠 2024-07-15 18:21:10

您可以尝试使用排序向量（这里是一个示例），这可能事实证明速度更快（您必须对其进行分析当然要确保）。

认为它会更快的原因：

更少的内存分配和释放（向量将扩展到使用的最大大小，然后重用释放的内存）。
具有随机访问的二进制查找应该比树遍历更快（特别是由于数据局部性）。

认为它会变慢的原因：

删除和添加意味着在内存中移动字符串，因为 string 的 swap 非常高效并且数据集的大小很小这可能不是问题。

回复收藏 0 原文

七色彩虹 2024-07-15 18:21:10

std::map 的比较器不是 std::equal_to 它是 std::less，我不确定短路 a < 的最佳方法是什么比较一下，它会比内置的更快。

如果总有< 15 个元素，也许您可以使用 std::string 之外的键？

回复收藏 0 原文

深府石板幽径 2024-07-15 18:21:10

莫蒂有一个很好的解决方案。但是，我很确定对于您的< 15 个元素的映射不是正确的方法，因为它的开销始终大于具有适当哈希方案的简单查找表的开销。在您的情况下，仅按长度进行散列可能就足够了，如果仍然产生冲突，请对相同长度的所有条目使用线性搜索。

为了确定我是否正确，当然需要一个基准，但我对其结果非常确定。

回复收藏 0 原文

上课铃就是安魂曲 2024-07-15 18:21:10

您可能会考虑预先计算字符串的哈希值，并将其保存在映射中。这样做可以在通过 std::map 树搜索期间提供散列比较而不是字符串比较的优势。

class HashedString
{
  unsigned m_hash;
  std::string m_string;

public:
  HashedString(const std::string& str)
    : m_hash(HashString(str))
    , m_string(str)
  {};
  // ... copy constructor and etc...

  unsigned GetHash() const {return m_hash;}
  const std::string& GetString() const {return m_string;}
};

这样做的好处是在构造时计算一次字符串的哈希值。之后，您可以实现一个比较函数：

struct comp
{
  bool operator()(const HashedString& lhs, const HashedString& rhs)
  {
    if(lhs.GetHash() < rhs.GetHash()) return true;
    if(lhs.GetHash() > rhs.GetHash()) return false;
    return lhs.GetString() < rhs.GetString();
  }
};

由于散列现在是在 HashedString 构造上计算的，因此它们以这种方式存储在 std::map 中，因此比较可以非常快地进行（整数比较）在天文数字般高的时间比例中，当哈希值相等时，会使用标准字符串进行比较。

You might consider pre-computing a hash for a string, and saving that in your map. Doing so gives the advantage of hash compares instead of string compares during the search through the std::map tree.

class HashedString
{
  unsigned m_hash;
  std::string m_string;

public:
  HashedString(const std::string& str)
    : m_hash(HashString(str))
    , m_string(str)
  {};
  // ... copy constructor and etc...

  unsigned GetHash() const {return m_hash;}
  const std::string& GetString() const {return m_string;}
};

This has the benefits of computing a hash of the string once, on construction. After this, you could implement a comparison function:

struct comp
{
  bool operator()(const HashedString& lhs, const HashedString& rhs)
  {
    if(lhs.GetHash() < rhs.GetHash()) return true;
    if(lhs.GetHash() > rhs.GetHash()) return false;
    return lhs.GetString() < rhs.GetString();
  }
};

Since hashes are now computed on HashedString construction, they are stored that way in the std::map, and so the compare can happen very quickly (an integer compare) in an astronomically high percentage of the time, falling back on standard string compares when the hashes are equal.

回复收藏 0 原文

渡你暖光 2024-07-15 18:21:10

也许您可以在将字符串用作地图中的键之前反转字符串？如果每个字符串的前几个字母相同，这可能会有所帮助。

回复收藏 0 原文

末蓝 2024-07-15 18:21:10

以下是您可以考虑的一些事项：

0) 您确定这是性能瓶颈所在吗？喜欢 Quantify、Cachegrind、gprof 或类似的结果吗？因为在这样的 smap 映射上查找应该相当快...

1) 您可以覆盖用于比较 std::map<> 中的键的函子，有第二个模板参数可以做到这一点。不过，我怀疑你能比 Operator< 做得更好。

2）地图内容变化大吗？如果不是，并且考虑到映射的尺寸非常小，也许使用排序向量和二分搜索可以产生更好的结果（例如，因为您可以更好地利用内存局部性。3

）元素在编译时是否已知？如果是这种情况，您可以使用完美的哈希函数来缩短查找时间。在网络上搜索 gperf。

4）你是否有很多查询都没有找到任何东西？如果是这样，也许与集合中的第一个和最后一个元素进行比较可以比每次完整搜索更快地消除许多不匹配。

这些已经被建议，但更详细：

5）由于您的字符串很少，也许您可以使用不同的密钥。例如，你们的钥匙尺寸都一样吗？您可以使用包含固定长度字符数组的类吗？您可以将字符串转换为数字或仅包含数字的某些数据结构吗？

回复收藏 0 原文

秋风の叶未落 2024-07-15 18:21:10

根据使用案例，您还可以使用一些其他技术。例如，我们有一个应用程序需要跟上超过一百万个不同的文件路径。问题是有数千个对象需要保存这些文件路径的小地图。

由于向数据集添加新文件路径是一项不常见的操作，因此当将路径添加到系统时，会搜索主地图。如果未找到路径，则会添加该路径并返回一个新的排序整数（从 1 开始）。如果路径已经存在，则返回先前分配的整数。然后，每个对象维护的每个映射都从基于字符串的映射转换为整数映射。这不仅极大地提高了性能，而且由于没有太多的字符串重复副本，因此减少了内存使用量。

当然，这是一个非常具体的优化。但当涉及到性能改进时，您经常发现自己必须针对特定问题制定量身定制的解决方案。

我讨厌字符串:) 并不是它们比较起来很慢，而是它们真的会在高性能软件上浪费你的 CPU 缓存。

回复收藏 0 原文

清泪尽 2024-07-15 18:21:10

尝试 std::tr1::unordered_map （在标头中找到）。这是一个哈希映射，虽然它不维护元素的排序顺序，但可能比常规映射快得多。

如果您的编译器不支持 TR1，请获取更新版本。 MSVC和gcc都支持TR1，我相信大多数其他编译器的最新版本也支持。不幸的是，许多库参考站点尚未更新，因此 TR1 仍然是一项很大程度上不为人知的技术。

我希望 C++0x 不是这样。

编辑：请注意，tr1::unordered_map 的默认哈希方法是 tr1::hash，它可能需要专门用于 UDT。

回复收藏 0 原文

放肆 2024-07-15 18:21:10

如果您有很长的公共子字符串，则 trie 可能是比 map 或 hash_map 更好的数据结构。不过，我说“可能” - hash_map 每次查找只遍历一次键，所以应该相当快。我不会进一步讨论它，因为其他人已经讨论过了。

如果某些键比其他键更频繁地查找，您也可以考虑使用展开树，但这当然会使最坏情况的查找比平衡树更糟糕，并且查找是变异操作，如果您正在使用，这可能对您很重要例如读写锁。

如果您更关心查找的性能而不是修改，那么使用 AVL 树可能比红黑树做得更好，我认为红黑树是 STL 实现通常用于映射的内容。 AVL 树通常具有更好的平衡性，因此平均每次查找所需的比较次数较少，但差异很小。

找到您满意的这些实现可能是一个问题。在 Boost 主页上搜索表明他们有 splay 和 AVL 树，但没有 trie。

您在评论中提到，您从未进行过找不到任何内容的查找。所以理论上你可以跳过最后的比较，在 15 < 的树中 2^4 个元素可以给你带来 20-25% 的加速，而无需做任何其他事情。事实上，可能还不止于此，因为相等的字符串比较起来最慢。是否值得仅仅为了这种优化而编写自己的容器是另一个问题。

您可能还考虑引用的局部性 - 我不知道您是否可以通过从小堆中分配键和节点来避免偶尔的页面丢失。如果您一次只需要大约 15 个条目，那么假设文件名限制低于 256 字节，您可以确保查找期间访问的所有内容都适合单个 4k 页面（当然，除了正在查找的键之外）。与几次页面加载相比，比较字符串可能微不足道。但是，如果这是您的瓶颈，则必须进行大量查找，因此我猜测所有内容都相当接近 CPU。也许值得检查一下。

另一个想法：如果您在存在大量争用的结构上使用悲观锁定（您在评论中说该程序是大规模多线程的），那么无论分析器告诉您什么（CPU 周期花费在什么代码中），通过有效地将您限制为 1 个核心，您可能会花费比您想象的更多的费用。尝试读写锁？

Where you have long common substrings, a trie might be a better data structure than a map or a hash_map. I said "might", though - a hash_map already only traverses the key once per lookup, so should be fairly fast. I won't discuss it further since others already have.

You could also consider a splay tree if some keys are more frequently looked up than others, but of course this makes the worst-case lookup worse than a balanced tree, and lookups are mutating operations, which may matter to you if you're using e.g. a reader-writer lock.

If you care about the performance of lookups more than modifications, you might do better with an AVL tree than a red-black, which I think is what STL implementations generally use for map. An AVL tree is typically better balanced and so will on average require fewer comparisons per lookup, but the difference is marginal.

Finding an implementation of these that you're happy with might be an issue. A search on the Boost main page suggests they have a splay and AVL tree but not a trie.

You mentioned in a comment that you never have a lookup that fails to find anything. So you could in theory skip the final comparison, which in a tree of 15 < 2^4 elements could give you something like a 20-25% speedup without doing anything else. In fact, maybe more than that, since equal strings are the slowest to compare. Whether it's worth writing your own container just for this optimisation is another question.

You might also consider locality of reference - I don't know whether you could avoid the occasional page miss by allocating the keys and the nodes out of a small heap. If you only need about 15 entries at a time, then assuming a file name limit below 256 bytes you could ensure that everything accessed during a lookup fits into a single 4k page (apart from the key being looked up, of course). It may be that comparing the strings is insignificant compared with a couple of page loads. However, if this is your bottleneck there must be an enormous number of lookups going on, so I'd guess that everything is reasonably close to the CPU. Worth checking, maybe.

Another thought: if you are using pessimistic locking on a structure where there's a lot of contention (you said in a comment the program is massively multi-threaded) then regardless of what the profiler tells you (what code the CPU cycles are spent in), it might be costing you more than you think by effectively limiting you to 1 core. Try a reader-writer lock?

回复收藏 0 原文

貪欢 2024-07-15 18:21:10

hash_map 不是标准的，请尝试使用 tr1 中提供的 unordered_map（如果您的工具链还没有，则可以在 boost 中使用）。

对于少量字符串，您可能最好使用vector，因为map通常以树的形式实现。

回复收藏 0 原文

滥情空心 2024-07-15 18:21:10

为什么不使用哈希表呢？ boost::unordered_map 可以做到。或者您可以推出自己的解决方案，并存储字符串的 crc 而不是字符串本身。或者更好的是，为字符串添加#defines，并使用它们进行查找，例如，

#define "STRING_1" STRING_1

Why don't you use a hashtable instead? boost::unordered_map could do. Or you can roll out your own solution, and store the crc of a string instead of the string itself. Or better yet, put #defines for the strings, and use those for lookup, e.g.,