Sort an STL list after all the push_backs, or just use a multimap?
We were using a multimap<int,string> to store several hundred thousand items (>300K) when we realized we needed to add more data for analysis. So we created a class that held a few items plus the overloaded operators the STL needs, and used a multimap<ourStruct,String>. This worked fine and didn't take much longer than before (with some test data). We then realized an STL list would do just fine, as long as we sorted it after we finished adding all the items. To our surprise, adding all items to the multimap still easily beats the total time to add all items to the list and then sort.
This doesn't make sense to us EE types. By our thinking, every insert into the multimap would have to traverse the list and then tack the item on to the end, whereas with the list we would just add to the end (via push_back), and then hopefully the sort wouldn't take as long.
One more factoid: we initially did the comparison test without sorting the list and were thrilled to see significant speedups using list. Then we added the sort, and were a bit stunned...
Any of the CS gurus out there care to weigh in?
Comments (2)
std::multimap uses a balanced tree [1], so it does not traverse the entire list when you insert an item. The number of items traversed for an insert is approximately the base-2 logarithm of the number of items in the collection. Based on what you've said, your best bet would probably be to put your data in a vector, and then sort.
[1] Technically, the standard doesn't directly require a balanced tree, but it requires the ability to traverse in sorted order, and logarithmic complexity for insertions and deletions in the worst case, and I'm not aware of many other data structures that can meet that requirement.
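To make the "vector, then sort" suggestion concrete, here is a minimal sketch next to the multimap approach from the question. The Record struct, its fields, and both function names are made up for illustration, since the poster's actual class isn't shown.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the poster's key class; the real fields are unknown.
struct Record {
    int id;
    int channel;
};

// Strict weak ordering, needed by both std::multimap and the sort below.
inline bool operator<(const Record& a, const Record& b) {
    if (a.id != b.id) return a.id < b.id;
    return a.channel < b.channel;
}

using Entry = std::pair<Record, std::string>;

// Append everything first, then sort once over contiguous storage.
std::vector<Entry> build_sorted_vector(const std::vector<Entry>& input) {
    std::vector<Entry> out;
    out.reserve(input.size());
    for (const Entry& e : input) out.push_back(e);
    // stable_sort keeps entries with equal keys in insertion order,
    // which matches what std::multimap::insert does since C++11.
    std::stable_sort(out.begin(), out.end(),
                     [](const Entry& a, const Entry& b) { return a.first < b.first; });
    return out;
}

// The approach from the question: every insert walks one root-to-leaf path.
std::multimap<Record, std::string> build_multimap(const std::vector<Entry>& input) {
    std::multimap<Record, std::string> out;
    for (const Entry& e : input) out.emplace(e.first, e.second);
    return out;
}
```

Both functions yield the same ordering, so they can be timed against each other on the real data. Which one wins will depend on the element type and the compiler, so it is worth measuring rather than assuming.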
Removed ref to hash. The balanced tree is why only about log2(n) items are traversed per insert.
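To put a rough number on the logarithmic insert cost for the 300K items in the question, here is a small sketch. The helper name is hypothetical, and it assumes a perfectly balanced binary tree; a red-black tree, which is what typical implementations use, can be up to about twice as tall.

```cpp
#include <cstddef>

// Number of levels in a perfectly balanced binary tree holding n items,
// i.e. the bit length of n. An insert visits at most this many nodes.
std::size_t balanced_tree_levels(std::size_t n) {
    std::size_t levels = 0;
    while (n > 0) {
        ++levels;
        n /= 2;
    }
    return levels;
}
```

For n = 300000 this gives 19 levels, so an insert touches a couple of dozen nodes at most, not hundreds of thousands.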