Sort an STL list after all the push_backs, or just use a multimap?

Posted on 2024-11-02 19:16:24

We were using a multimap<int,string> to store several hundred thousand items (>300K) when we realized we needed to add more data for analysis. So we created a class that held a few items plus the overloaded operators the STL requires, and used a multimap<ourStruct,string>. This worked fine and didn't take much longer than before (with some test data). We then realized an STL list would do just as well, as long as we sorted it after we finished adding all the items. To our surprise, adding all the items to the multimap still easily beats the total time to add all the items to the list and then sort it.
This doesn't make sense to us EE types. By our thinking, every insert into the multimap would have to traverse the list and then tack the item on at the end, whereas with the list we would just add to the end (via push_back), and then hopefully the sort wouldn't take as long.
One more factoid: we initially ran the comparison without sorting the list and were thrilled to see a significant speedup using the list. Then we added the sort, and were a bit stunned...
Any of the CS gurus out there care to weigh in?

Comments (2)

沉鱼一梦 2024-11-09 19:16:24

std::multimap uses a balanced tree [1], so it does not traverse the entire collection when you insert an item. The number of items visited per insert is approximately the base-2 logarithm of the number of items in the collection.

Based on what you've said, your best bet would probably be to put your data in a vector, and then sort.

[1] Technically, the standard doesn't directly require a balanced tree, but it requires the ability to traverse in sorted order, and logarithmic complexity for insertions and deletions in the worst case, and I'm not aware of many other data structures that can meet that requirement.
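
As a rough illustration of the three approaches discussed here, the sketch below builds the same data with a multimap, with a list followed by list::sort, and with a vector followed by std::sort. The Record struct, the function names, the fixed seed, and the key range are all assumptions made for the example, not the poster's actual code, and the time_ms helper is only a simple wall-clock timer.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <list>
#include <map>
#include <random>
#include <string>
#include <vector>

// Hypothetical stand-in for the poster's class: a key plus some payload.
struct Record {
    int key;
    std::string payload;
};

// Order records by key only, mirroring how the multimap orders its entries.
inline bool operator<(const Record& a, const Record& b) { return a.key < b.key; }

// Generate n records with pseudo-random keys (fixed seed for repeatability).
std::vector<Record> make_input(std::size_t n) {
    std::mt19937 rng(12345);
    std::uniform_int_distribution<int> dist(0, 1000000);
    std::vector<Record> out;
    out.reserve(n);
    for (std::size_t i = 0; i < n; ++i) out.push_back({dist(rng), "item"});
    return out;
}

// Approach 1: insert into a multimap; the container stays sorted throughout.
std::vector<int> keys_via_multimap(const std::vector<Record>& in) {
    std::multimap<int, std::string> m;
    for (const Record& r : in) m.emplace(r.key, r.payload);
    std::vector<int> keys;
    keys.reserve(m.size());
    for (const auto& kv : m) keys.push_back(kv.first);
    return keys;
}

// Approach 2: push_back into a list, then sort once with list::sort.
std::vector<int> keys_via_list(const std::vector<Record>& in) {
    std::list<Record> l;
    for (const Record& r : in) l.push_back(r);
    l.sort();  // uses operator< defined above
    std::vector<int> keys;
    keys.reserve(l.size());
    for (const Record& r : l) keys.push_back(r.key);
    return keys;
}

// Approach 3: push_back into a vector, then sort once with std::sort.
std::vector<int> keys_via_vector(const std::vector<Record>& in) {
    std::vector<Record> v;
    v.reserve(in.size());
    for (const Record& r : in) v.push_back(r);
    std::sort(v.begin(), v.end());
    std::vector<int> keys;
    keys.reserve(v.size());
    for (const Record& r : v) keys.push_back(r.key);
    return keys;
}

// Wall-clock time of a callable, in milliseconds.
template <class F>
double time_ms(F&& f) {
    auto t0 = std::chrono::steady_clock::now();
    f();
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}
```

To compare, one could call make_input(300000) once and wrap each keys_via_* call in time_ms. All three produce the same sorted key sequence; which one is fastest depends on the compiler, the standard library, and the data, so it is worth measuring on the real workload.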

檐上三寸雪 2024-11-09 19:16:24

Removed the reference to hashing. The balanced tree is why only a log2(n) traversal is required.
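
To put a number on that for the roughly 300K items in the question, here is a minimal sketch that computes floor(log2(n)) by repeated halving. The function name floor_log2 is made up for this example, and real tree implementations (commonly red-black trees) can be somewhat deeper than a perfectly balanced tree, so treat the result as an order-of-magnitude estimate.

```cpp
#include <cstddef>

// floor(log2(n)) by repeated halving; returns 0 for n <= 1.
// This approximates how many levels a perfectly balanced binary tree
// with n nodes would have, not the exact depth of any real std::multimap.
std::size_t floor_log2(std::size_t n) {
    std::size_t steps = 0;
    while (n > 1) {
        n /= 2;
        ++steps;
    }
    return steps;
}
```

For n = 300000 this gives 18, so an insert visits on the order of 18 nodes rather than 300,000.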
