哪种 STL 容器最适合 std::sort? (这还重要吗?)

发布于 2024-07-16 08:05:13 字数 101 浏览 3 评论 0原文

标题不言而喻......

容器的选择是否会以某种方式影响默认 std::sort 算法的速度? 例如,如果我使用列表,排序算法只是切换节点指针还是切换节点中的整个数据?

The title speaks for itself ....

Does choice of container affects the speed of the default std::sort algorithm somehow or not? For example, if I use list, does the sorting algorithm just switch the node pointers or does it switch the whole data in the nodes?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(9

清风夜微凉 2024-07-23 08:05:13

选择确实会产生影响,但预测哪个容器最有效是非常困难的。 最好的方法是使用最适合您的应用程序使用的容器(可能是 std::vector),查看该容器的排序是否足够快,如果是,则坚持使用它。 如果没有,请对排序问题进行性能分析,并根据分析数据选择不同的容器。

作为一名前讲师和前培训师,我有时觉得个人对链表具有神秘的性能增强特性这一普遍观点负有责任。 听一位知情者的话:链表出现在这么多教科书和教程中的唯一原因是因为对于编写这些书籍和教程的人来说,拥有一个可以说明指针、动态内存管理、递归的数据结构很方便,搜索和排序合二为一——这与效率无关。

The choice does make a difference, but predicting which container will be the most efficient is very difficult. The best approach is to use the container that is easiest for your application to work with (probably std::vector), see if sorting is adequately fast with that container, and if so stick wth it. If not, do performance profiling on your sorting problem and choose different container based on the profile data.

As an ex-lecturer and ex-trainer, I sometimes feel personally responsible for the common idea that a linked list has mystical performance enhancing properties. Take it from one who knows: the only reason a linked list appear in so many text books and tutorials is because it is covenient for the people who wrote those books and tutorials to have a data structure that can illustrate pointers, dynamic memory mangement, recursion, searching and sorting all in one - it has nothing to do with efficiency.

若沐 2024-07-23 08:05:13

我认为 std::sort 不适用于列表,因为它需要一个 list 未提供的随机访问迭代器。 请注意,list<> 提供了 sort 方法,但它与 std::sort 完全分开。

容器的选择很重要。 STL 的 std::sort 依赖迭代器来抽象容器存储数据的方式。 它只是使用您提供的迭代器来移动元素。 这些迭代器在访问和分配元素方面的工作速度越快,std::sort 的工作速度就越快。

I don't think std::sort works on lists as it requires a random access iterator which is not provided by a list<>. Note that list<> provides a sort method but it's completely separate from std::sort.

The choice of container does matter. STL's std::sort relies on iterators to abstract away the way a container stores data. It just uses the iterators you provide to move elements around. The faster those iterators work in terms of accessing and assigning an element, the faster the std::sort would work.

爱你是孤单的心事 2024-07-23 08:05:13

std::list 绝对不是 std::sort() 的好(有效)选择,因为 std::sort() 需要随机访问迭代器。 std::map 和朋友也不好,因为元素的位置无法强制执行; 也就是说,用户无法通过插入特定位置或交换来强制映射元素在映射中的位置。 在标准容器中,我们只剩下 std::vectorstd::deque

std::sort() 与其他标准算法类似,它仅通过交换元素值 (*t = *s) 来起作用。 因此,即使列表神奇地支持 O(1) 访问,链接也不会被重新组织,而是它们的值会被交换。

由于 std::sort() 不会更改容器的大小,因此无论您使用 std::vector 还是 std:: ,运行时性能都不会产生任何差异。双端队列。 原始数组的排序速度也应该很快,甚至可能比标准容器更快——但我不认为速度差异足以证明使用它们的合理性。

std::list is definitely not a good (valid) choice for std::sort(), because std::sort() requires random-access iterators. std::map and friends are also no good because an element's position cannot be enforced; that is, the position of an element in a map cannot be enforced by the user with insertion into a particular position or a swap. Among the standard containers we're down to std::vector and std::deque.

std::sort() is like other standard algorithms in that it only acts by swapping elements' values around (*t = *s). So even if list would magically support O(1) access the links wouldn't be reorganized but rather their values would be swapped.

Because std::sort() doesn't change the container's size it should make no difference in runtime performance whether you use std::vector or std::deque. Primitive arrays should be also fast to sort, probably even faster than the standard containers -- but I don't expect the difference in speed to be significant enough to justify using them.

做个ˇ局外人 2024-07-23 08:05:13

这取决于元素类型。

如果您只是存储指针(或 POD),那么向量将是最快的。 如果您要存储对象,那么列表的排序会更快,因为它将交换节点而不是物理元素。

It depends on the element type.

If you're just storing pointers (or POD) then vector will be fastest. If you're storing objects then list's sort will be faster as it will swap nodes and not physical elements.

挽心 2024-07-23 08:05:13

排序算法对您的容器一无所知。 它所知道的只是随机访问迭代器。 因此,您可以对 STL 容器中不存在的内容进行排序。 因此,它的速度取决于您提供的迭代器,以及取消引用和复制它们所指向的内容的速度。

std::sort 不适用于 std::list,因为排序需要随机访问迭代器。 对于这种情况,您应该使用 std::list 的成员函数类型之一。 这些成员函数将有效地交换链表指针,而不是复制元素。

The sort algorithm knows nothing about your container. All it knows about are random-access iterators. Thus you can sort things that aren't even in a STL container. So how fast it is going to be depends on the iterators you give it, and how fast it is to dereference and copy what they point to.

std::sort won't work on std::list, since sort requires random access iterators. You should use one of std::list's member function sorts for that case. Those member functions will efficiently swap around linked list pointers instead of copying elements.

江挽川 2024-07-23 08:05:13

向量。

始终使用矢量作为默认值。 与任何其他容器相比,它具有最低的空间开销和最快的访问速度(以及 C 兼容布局和随机访问迭代器等其他优点)。

现在,问问自己 - 您还用容器做什么? 您需要强大的异常保证吗? 列表、集合和映射可能是更好的选择(尽管它们都有自己的排序例程)。 您需要定期将元素添加到容器的前面吗? 考虑双端队列。 您的容器是否需要始终进行分类? 集合和地图可能更合适。

最后,明确什么是最适合您的,然后选择最合适的容器并衡量它如何满足您的需求。

Vector.

Always use vector as your default. It has the lowest space overheads and fastest access of any other container (among other advantages like C-compatible layout and random-access iterators).

Now, ask yourself - what else you doing with your container? Do you need strong exception guarantees? List, set and map are likely to be better options (though they all have their own sort routines). Do you need to regularly add elements to the front of your container? Consider deque. Does your container need to always be sorted? Set and map are likely to be a better fit.

Finally, figure out specifically what "best" is for you and then choose the most appropriate container and measure how it performs for your needs.

绻影浮沉 2024-07-23 08:05:13

我完全同意上面的人发表的言论。 但学习新事物的最佳方法是什么? 嘿!!!! 当然不是阅读文本并死记硬背,但是......示例:D 最近我沉浸在 STL 指定的容器中,这里是不言自明的快速测试代码,我希望:

#include <iostream>
#include <vector>
#include <deque>
#include <array>
#include <list>
#include <iterator>
#include <cstdlib>
#include <algorithm>
#include "Timer.h"

constexpr int SIZE = 1005000;

using namespace std;

void test();

int main(){
    cout<<"array allocates "<<static_cast<double>(SIZE)/(1024*1024)<<" MB\n";
    test();


    return 0;
}


void test(){
    int values[SIZE];
    int size = 0;

    //init values to sort:
    do{
        values[size++] = rand() % 100000;
    }while(size < SIZE);

    //feed array with values:
    array<int, SIZE> container_1;
    for(int i = 0; i < SIZE; i++)
        container_1.at(i) = values[i];

    //feed vector with values
    vector<int> container_2(begin(values), end(values));
    list<int> container_3(begin(values), end(values)); 
    deque<int> container_4(begin(values), end(values)); 

    //meassure sorting time for containers
    {
       Timer t1("sort array");
       sort(container_1.begin(), container_1.end());
    }

    {
       Timer t2("sort vector");
       sort(container_2.begin(), container_2.end());
    }

    {
       Timer t3("sort list");
       container_3.sort();
    }

    {
       Timer t4("sort deque");
       sort(container_4.begin(), container_4.end());
    }

}

定时器的代码:

#include <chrono>
#include <string>
#include <iostream>

using namespace std;

class Timer{

public:
    Timer(string name = "unnamed") : mName(name){ mStart = chrono::system_clock::now();}
    ~Timer(){cout<<"action "<<mName<<" took: "<<
             chrono::duration_cast<chrono::milliseconds>(
                     chrono::system_clock::now() - mStart).count()<<"ms"<<endl;}
private:
    chrono::system_clock::time_point mStart;
    string mName;
};

这里是不使用优化时的结果(g++ --std=c++11 file.cpp -o a.out):

数组分配 0.958443 MB
排序数组的操作花费了:183ms
动作排序向量花费:316ms
操作排序列表花费:725ms
操作排序双端队列花费:436ms

并且经过优化(g++ -O3 --std=c++11 file.cpp -o a.out):

数组分配0.958443 MB< br>
排序数组的操作花费了:55ms
动作排序向量花费:57ms
操作排序列表花费:264ms
操作排序双端队列花费:67ms

请注意,尽管向量和数组在这种情况下排序的时间相似,但数组大小受到限制,因为它应该在堆栈上初始化(默认情况下,不使用自己的分配器等),

因此它还取决于您是否使用编译器优化,如果没有,我们可能会看到明显的差异。

I totally agree with the statements that guys have posted above. But what is the best way to learn new things? Hey!!!! surely not reading the text and learning by heart but.... EXAMPLES :D As recently I immersed in containers specified in STL, here is the quick test code that is self-explanatory, I hope:

#include <iostream>
#include <vector>
#include <deque>
#include <array>
#include <list>
#include <iterator>
#include <cstdlib>
#include <algorithm>
#include "Timer.h"

constexpr int SIZE = 1005000;

using namespace std;

void test();

int main(){
    cout<<"array allocates "<<static_cast<double>(SIZE)/(1024*1024)<<" MB\n";
    test();


    return 0;
}


void test(){
    int values[SIZE];
    int size = 0;

    //init values to sort:
    do{
        values[size++] = rand() % 100000;
    }while(size < SIZE);

    //feed array with values:
    array<int, SIZE> container_1;
    for(int i = 0; i < SIZE; i++)
        container_1.at(i) = values[i];

    //feed vector with values
    vector<int> container_2(begin(values), end(values));
    list<int> container_3(begin(values), end(values)); 
    deque<int> container_4(begin(values), end(values)); 

    //meassure sorting time for containers
    {
       Timer t1("sort array");
       sort(container_1.begin(), container_1.end());
    }

    {
       Timer t2("sort vector");
       sort(container_2.begin(), container_2.end());
    }

    {
       Timer t3("sort list");
       container_3.sort();
    }

    {
       Timer t4("sort deque");
       sort(container_4.begin(), container_4.end());
    }

}

And the code for timer:

#include <chrono>
#include <string>
#include <iostream>

using namespace std;

class Timer{

public:
    Timer(string name = "unnamed") : mName(name){ mStart = chrono::system_clock::now();}
    ~Timer(){cout<<"action "<<mName<<" took: "<<
             chrono::duration_cast<chrono::milliseconds>(
                     chrono::system_clock::now() - mStart).count()<<"ms"<<endl;}
private:
    chrono::system_clock::time_point mStart;
    string mName;
};

Here is the result when no optimization is used (g++ --std=c++11 file.cpp -o a.out):

array allocates 0.958443 MB
action sort array took: 183ms
action sort vector took: 316ms
action sort list took: 725ms
action sort deque took: 436ms

and with optimization (g++ -O3 --std=c++11 file.cpp -o a.out):

array allocates 0.958443 MB
action sort array took: 55ms
action sort vector took: 57ms
action sort list took: 264ms
action sort deque took: 67ms

Notice that although vector and array has similar times sorting for this case, array size is limited as it is supposed to be initialized on stack (by default, not using own allocators etc.)

So it depends also if you use optimization for compiler, if not, we may see noticeable difference.

溇涏 2024-07-23 08:05:13

这确实很重要,因为不同的容器有不同的内存访问模式等,这可能会发挥作用。

但是,std::sort 不适用于 std::list<>::iterators,因为它们不是 RandomAccessIterators。 此外,尽管可以实现对 std::list<> 的专门化来打乱节点的指针,但它可能会产生奇怪且令人惊讶的语义后果 - 例如。 如果向量中的排序范围内有一个迭代器,则其值将在排序后发生变化,而对于此专业化来说,情况并非如此。

It surely does matter, just because different containers have different memory access patterns etc. which could play a role.

However, std::sort doesn't work on std::list<>::iterators as these are not RandomAccessIterators. Moreover, although it would be possible to implement a specialization for std::list<> that would shuffle the nodes' pointers, it would probably have strange and surprising semantic consequences - eg. if you have an iterator inside sorted range in a vector, its value will change after the sorting, which would not be true with this specialization.

绝情姑娘 2024-07-23 08:05:13

std::sort 需要随机访问迭代器,因此您唯一使用的选择是向量或双端队列。 它将交换值,并且猜测向量的执行速度可能会比双端队列稍快,因为它通常具有更简单的底层数据结构。 但差异可能非常小。

如果您使用 std::list,则有一个专门化 (std::list::sort) 应该交换指针而不是值。 然而,因为它不是随机访问,所以它将使用合并排序而不是快速排序,这可能意味着算法本身会慢一些。

无论如何,我认为答案通常是向量。 如果每个元素都有很大的类,因此复制开销在排序过程中占主导地位,那么列表可能会击败它。 或者,您可以将指向它们的指针存储在向量中,并提供自定义谓词来对它们进行适当的排序。

std::sort requires random access iterators, so your only options to use that are vector or deque. It will swap the values, and at a guess vector will probably perform slightly faster than deque because it typically has a simpler underlying data structure. The difference is likely very marginal though.

If you use a std::list, there is a specialisation (std::list::sort) which should swap the pointers rather than the values. However because it's not random access it'll use mergesort instead of quicksort, which will probably mean that the algorithm itself is a little slower.

Anyway, I think the answer is normally vector. If you have large classes for each element so copying overhead dominates the sorting process, list might beat it. Or alternatively you could store pointers to them in a vector and supply a custom predicate to sort them appropriately.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文