为什么在C++中找到最有效的元素?

发布于 2025-01-31 15:46:29 字数 1303 浏览 3 评论 0原文

我需要一个快速的STL容器来查找其中是否存在元素,因此我测试了数组,向量,集合和无序集。我认为,由于独特和有序的值,该集合被优化用于查找元素,但是1000万次迭代的最快是: 阵列(0.3秒) 向量(1.7秒) 无序集(1.9秒) 集合(3秒)

这是代码:

#include <algorithm>
#include <iostream>
#include <set>
#include <unordered_set>
#include <vector>

int main() {
    using std::cout, std::endl, std::set, std::unordered_set, std::vector, std::find;
    
    int i;
    
    const long ITERATIONS = 10000000;
    
    int a[] {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15};
    for (int i = 0; i < ITERATIONS; i++) {
        if (find(a, a + 16, rand() % 64) == a + 16) {}
        else {}
    }
    
    vector<int> v{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15};
    for (i = 0; i < ITERATIONS; i++) {
        if (find(v.begin(), v.end(), rand() % 64) == v.end()) {}
        else {}
    }
    
    set<int> s({0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15});
    for (i = 0; i < ITERATIONS; i++) {
        if (find(s.begin(), s.end(), rand() % 64) == s.end()) {}
        else {}
    }
    
    unordered_set<int> us({0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15});
    for (i = 0; i < ITERATIONS; i++) {
        if (find(us.begin(), us.end(), rand() % 64) == us.end()) {}
        else {}
    }
}

I need a fast STL container for finding if an element exists in it, so I tested arrays, vectors, sets, and unordered sets. I thought that sets were optimized for finding elements, because of unique and ordered values, but the fastest for 10 million iterations are:
arrays (0.3 secs)
vectors (1.7 secs)
unordered sets (1.9 secs)
sets (3 secs)

Here is the code:

#include <algorithm>
#include <iostream>
#include <set>
#include <unordered_set>
#include <vector>

int main() {
    using std::cout, std::endl, std::set, std::unordered_set, std::vector, std::find;
    
    int i;
    
    const long ITERATIONS = 10000000;
    
    int a[] {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15};
    for (int i = 0; i < ITERATIONS; i++) {
        if (find(a, a + 16, rand() % 64) == a + 16) {}
        else {}
    }
    
    vector<int> v{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15};
    for (i = 0; i < ITERATIONS; i++) {
        if (find(v.begin(), v.end(), rand() % 64) == v.end()) {}
        else {}
    }
    
    set<int> s({0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15});
    for (i = 0; i < ITERATIONS; i++) {
        if (find(s.begin(), s.end(), rand() % 64) == s.end()) {}
        else {}
    }
    
    unordered_set<int> us({0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15});
    for (i = 0; i < ITERATIONS; i++) {
        if (find(us.begin(), us.end(), rand() % 64) == us.end()) {}
        else {}
    }
}

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

月下客 2025-02-07 15:46:29

请记住,在cc ++中有就像规则
这意味着,只要运行代码的可观察结果保持不变,编译器就可以通过任何方式(即使是删除代码)来转换代码。

这是 godbolt 您的代码。

现在请注意编译器对if(find(a,a + 16,rand()%64)== a + 16){}

.L206:
        call    rand
        sub     ebx, 1
        jne     .L206

基本上是编译器未使用并删除的结果,一切都期望呼叫rand()具有副作用(结果可见)。

std :: vector

.L207:
        call    rand
        sub     ebx, 1
        jne     .L207

甚至对于std :: Setstd :: unordered_set编译器也能够执行相同的优化。您看到的差异(您没有指定您的工作方式)只是初始化所有这些变量的结果,这对于更复杂的容器来说很耗时。

编写良好的性能测试很难,应该谨慎进行。

您的问题也有第二个问题。给定代码的时间复杂性。
搜索数组和搜索std :: Setstd :: Unrodered_set缩放与数据尺寸不同。对于小数据集,简单的数组将是命运,因为它的简单实现和对内存的最佳访问。随着数据大小的增长std ::的时间复杂性::查找数组的将以o(n)查找项目将以o(log n)std :: unordered_seto(log n) 它将是恒定的时间o(1)。因此,对于少量数据阵列将是最快的,对于中等大小std :: Set是赢家,如果数据数量将是大的std :: Unordered_set最好是最好的。

看看此基准示例示例使用Google-Benchmark。

Please remember that in C and C++ there is the as if rule!
This means compiler can transform code by any means (even by dropping code) as long as observable result of running code remains unchanged.

Here is godbolt of your code.

Now note what compiler did for if (find(a, a + 16, rand() % 64) == a + 16) {}:

.L206:
        call    rand
        sub     ebx, 1
        jne     .L206

Basically compiler noticed that result of it is not used and remove everything expect calling rand() which has side effects (visible changes in results).

The same happen for std::vector:

.L207:
        call    rand
        sub     ebx, 1
        jne     .L207

And even for std::set and std::unordered_set compiler was able to perform same optimization. The difference you are seeing (you didn't specified how you did that) is just result of initializing all of this variables which is time consuming for more complex containers.

Writing good performance test is hard and should be approached with caution.

There is also second problem with your question. Time complexity of given code.
Searching array and searching std::set and std::unrodered_set scales differently to size of data. For small data set simple array will be fates since its simple implementation and optimal access to memory. As data size grow time complexity of std::find for array will grow as O(n) on other hand slower std::set time to find item will grow as O(log n) and for std::unordered_set it will be constant time O(1). So for small amount of data array will be fastest, for medium size std::set is the winner and if amount of data will be large std::unordered_set will be best.

Take a look on this benchmark example which uses google benchmark.

救星 2025-02-07 15:46:29

您不是在衡量效率,而是在衡量性能。而且这样做很糟糕。

env> env中的地址空间随机化或仅不同的用户名或其他变量的效果对速度的影响约为40%。这不仅仅是-O0和-O2之间的差异。您只能测量一个单个系统,其中一个单个地址空间布局10000000次。这使得无意义的价值。

但是,您仍然设法弄清楚16 int的任何尝试都比仅仅看待所有尝试都会表现更糟糕的尝试。在单个缓存线路中进行简单的线性搜索(如果布局不良,更有可能的话,则只是两个)只是最好的方法。

现在,使用10000000 INT重试,运行二进制1000次。如果您使用布局随机器将布局的事故排除在时间安排之外,更好。

注意:iirc的排序极限,其中有气泡排序的数组最好在32到64个int之间,并且发现更简单。

You are not measuring efficiency, you are measuring performance. And doing so badly.

The effect of address space randomization or just different usernames or other variable in env has up to about 40% effect on speed. That's more than the difference between -O0 and -O2. You are only measuring one single system with one single address space layout 10000000 times. That makes the value about meaningless.

And yet you still manage to figure out that for 16 ints any attempt to be better than just looking at all of them will perform worse. A simple linear search in a single cache line (or two if the layout is bad, more likely) is just simply the best way.

Now try again with 10000000 ints and run the binary 1000 times. Even better if you use a layout randomizer to truly exclude accidents of the layout from the timing.

Note: Iirc the limit for sort where an array with bubble sort is best is somewhere between 32 and 64 ints and find is even simpler.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文