动态分配的数组或 std::vector

发布于 2024-07-26 11:55:43 字数 1156 浏览 4 评论 0原文

我正在尝试优化我的 C++ 代码。 我在互联网上搜索过使用动态分配的 C++ 数组与使用 std::vector 的比较,并且通常看到有利于 std::vector 的建议,并且两者之间的性能差异可以忽略不计。 例如这里 - 在中使用数组或 std::vectors C++,性能差距是多少?

然而,我编写了一些代码来测试迭代数组/向量并向元素赋值的性能,我通常发现使用动态分配的数组比使用向量快近 3 倍(我确实事先指定了向量的大小) )。 我用的是g++-4.3.2。

不过,我觉得我的测试可能忽略了我不知道的问题,因此我将不胜感激有关此问题的任何建议。

感谢

使用代码 -

#include <time.h>
#include <iostream>
#include <vector>

using namespace std;

int main() {
  clock_t start,end;
  std::vector<int> vec(9999999);
  std::vector<int>::iterator vecIt = vec.begin();
  std::vector<int>::iterator vecEnd = vec.end();

  start = clock();
  for (int i = 0; vecIt != vecEnd; i++) {
    *(vecIt++) = i;
  }
  end = clock();
  cout<<"vector: "<<(double)(end-start)/CLOCKS_PER_SEC<<endl;

  int* arr = new int[9999999];
  start = clock();
  for (int i = 0; i < 9999999; i++) {
    arr[i] = i;
  }
  end = clock();
  cout<<"array: "<<(double)(end-start)/CLOCKS_PER_SEC<<endl;
}

I'm trying to optimize my C++ code. I've searched the internet on using dynamically allocated C++ arrays vs using std::vector and have generally seen a recommendation in favor of std::vector and that the difference in performance between the two is negligible. For instance here - Using arrays or std::vectors in C++, what's the performance gap?.

However, I wrote some code to test the performance of iterating through an array/vector and assigning values to the elements and I generally found that using dynamically allocated arrays was nearly 3 times faster than using vectors (I did specify a size for the vectors beforehand). I used g++-4.3.2.

However I feel that my test may have ignored issues I don't know about so I would appreciate any advice on this issue.

Thanks

Code used -

#include <time.h>
#include <iostream>
#include <vector>

using namespace std;

int main() {
  clock_t start,end;
  std::vector<int> vec(9999999);
  std::vector<int>::iterator vecIt = vec.begin();
  std::vector<int>::iterator vecEnd = vec.end();

  start = clock();
  for (int i = 0; vecIt != vecEnd; i++) {
    *(vecIt++) = i;
  }
  end = clock();
  cout<<"vector: "<<(double)(end-start)/CLOCKS_PER_SEC<<endl;

  int* arr = new int[9999999];
  start = clock();
  for (int i = 0; i < 9999999; i++) {
    arr[i] = i;
  }
  end = clock();
  cout<<"array: "<<(double)(end-start)/CLOCKS_PER_SEC<<endl;
}

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(10

居里长安 2024-08-02 11:55:43

对 C++ 容器进行基准测试时,启用大多数编译器优化非常重要。 我自己对 SO 的几个答案都违反了这一点 - 例如,当像operator[]这样的东西没有内联时,函数调用的开销可能非常大。

When benchmarking C++ comtainers, it's important to enable most compiler optimisations. Several of my own answers on SO have fallen foul of this - for example, the function call overhead when something like operator[] is not inlined can be very significant.

非要怀念 2024-08-02 11:55:43

只是为了好玩,尝试使用指针而不是整数索引来迭代普通数组(代码应该看起来像向量迭代,因为 STL 迭代器的要点是对于大多数操作来说看起来像指针算术)。 我敢打赌在这种情况下速度将完全相同。 这当然意味着您应该选择向量,因为它可以让您免去手动管理数组的麻烦。

Just for fun, try iterating over the plain array using a pointer instead of an integer index (the code should look just like the vector iteration, since the point of STL iterators is to appear like pointer arithmetic for most operations). I bet the speed will be exactly equal in that case. Which of course means you should pick the vector, since it will save you a world of headaches from managing arrays by hand.

浅浅 2024-08-02 11:55:43

关于标准库类(例如 std::vector)的问题是,是的,天真地说,它比原始数组要多得多的代码。 但所有这些都可以由编译器轻松内联,这意味着如果启用优化,它基本上会变成与使用原始数组相同的代码。 那么速度差异不是可以忽略不计的,而是不存在的。 所有开销都在编译时消除。

但这需要启用编译器优化。

The thing about the standard library classes such as std::vector is that yes, naively, it is a lot more code than a raw array. But all of it can be trivially inlined by the compiler, which means that if optimizations are enabled, it becomes essentially the same code as if you'd used a raw array. The speed difference then is not negligible but non-existent. All the overhead is removed at compile-time.

But that requires compiler optimizations to be enabled.

混吃等死 2024-08-02 11:55:43

我想你发现迭代和添加到 std::vector 比普通数组慢 3 倍的原因是迭代向量和进行分配的成本的组合。

编辑:

这是我在测试用例之前的最初假设; 然而运行测试用例(用 -O3 编译)显示相反的情况 - std::vector 实际上快了 3 倍,这让我感到惊讶。

我看不出 std::vector 如何比普通数组副本更快(当然不是快 3 倍) - 我认为对 std::vector 编译的代码应用了一些优化,而数组版本则没有发生这种情况。

原始基准测试结果:

$ ./array
array:  0.059375
vector: 0.021209
  • std::vector 速度提高了 3 倍。 再次进行相同的基准测试,除了添加一个额外的外循环来运行测试迭代器循环 1000 次:

    $ ./数组
    数组:21.7129
    矢量:21.6413

  • std::vector 现在〜与数组的速度相同。

编辑2

找到了! 因此,您的测试用例的问题在于,在向量情况下,保存数据的内存似乎已经在 CPU 缓存中 - 要么是通过初始化的方式,要么是由于调用 vec.end()< /代码>。 如果我在每次计时测试之前“预热”CPU 缓存,我会得到相同的数组和向量数字:

#include <time.h>
#include <iostream>
#include <vector>

int main() {
  clock_t start,end;
  std::vector<int> vec(9999999);
  std::vector<int>::iterator vecIt = vec.begin();
  std::vector<int>::iterator vecEnd = vec.end();

  // get vec into CPU cache.
  for (int i = 0; vecIt != vecEnd; i++) { *(vecIt++) = i; }
  vecIt = vec.begin();
  start = clock();
  for (int i = 0; vecIt != vecEnd; i++) {
    *(vecIt++) = i;
  }
  end = clock();
  std::cout<<"vector: "<<(double)(end-start)/CLOCKS_PER_SEC<<std::endl;

  int* arr = new int[9999999];

  // get arr into CPU cache.
  for (int i = 0; i < 9999999; i++) { arr[i] = i; }
  start = clock();
  for (int i = 0; i < 9999999; i++) {
    arr[i] = i;
  }
  end = clock();
  std::cout<<"array: "<<(double)(end-start)/CLOCKS_PER_SEC<<std::endl;
}

这给出了以下结果:

$ ./array
vector: 0.020875
array: 0.020695

I imagine the reason why you found iterating and adding to std::vector 3 times slower than a plain array is a combination of the cost of iterating the vector and doing the assigment.

Edit:

That was my initial assumption before the testcase; however running the testcase (compiled with -O3) shows the converse - std::vector is actually 3 times faster, which surprised me.

I can't see how std::vector could be faster (certainly not 3 times faster) than a vanilla array copy - I think there's some optimisation being applied to the std::vector compiled code which isn't happening for the array version.

Original benchmark results:

$ ./array
array:  0.059375
vector: 0.021209
  • std::vector is 3x faster. Same benchmark again, except add an additional outer loop to run the test iterater loop 1000 times:

    $ ./array
    array: 21.7129
    vector: 21.6413

  • std::vector is now ~ the same speed as array.

Edit 2

Found it! So the problem with your test case is that in the vector case the memory holding the data appears to be already in the CPU cache - either by the way it is initialised, or due to the call to vec.end(). If I 'warm' up the CPU cache before each timing test, I get the same numbers for array and vector:

#include <time.h>
#include <iostream>
#include <vector>

int main() {
  clock_t start,end;
  std::vector<int> vec(9999999);
  std::vector<int>::iterator vecIt = vec.begin();
  std::vector<int>::iterator vecEnd = vec.end();

  // get vec into CPU cache.
  for (int i = 0; vecIt != vecEnd; i++) { *(vecIt++) = i; }
  vecIt = vec.begin();
  start = clock();
  for (int i = 0; vecIt != vecEnd; i++) {
    *(vecIt++) = i;
  }
  end = clock();
  std::cout<<"vector: "<<(double)(end-start)/CLOCKS_PER_SEC<<std::endl;

  int* arr = new int[9999999];

  // get arr into CPU cache.
  for (int i = 0; i < 9999999; i++) { arr[i] = i; }
  start = clock();
  for (int i = 0; i < 9999999; i++) {
    arr[i] = i;
  }
  end = clock();
  std::cout<<"array: "<<(double)(end-start)/CLOCKS_PER_SEC<<std::endl;
}

This gives me the following result:

$ ./array
vector: 0.020875
array: 0.020695
拧巴小姐 2024-08-02 11:55:43

我同意rmeador的观点,

  for (int i = 0; vecIt != vecEnd; i++) {
    *(vecIt++) = i; // <-- quick offset calculation
  }
  end = clock();
  cout<<"vector: "<<(double)(end-start)/CLOCKS_PER_SEC<<endl;

  int* arr = new int[9999999];
  start = clock();
  for (int i = 0; i < 9999999; i++) {
    arr[i] = i; // <-- not fair play :) - offset = arr + i*size(int)
  }

I agree with rmeador,

  for (int i = 0; vecIt != vecEnd; i++) {
    *(vecIt++) = i; // <-- quick offset calculation
  }
  end = clock();
  cout<<"vector: "<<(double)(end-start)/CLOCKS_PER_SEC<<endl;

  int* arr = new int[9999999];
  start = clock();
  for (int i = 0; i < 9999999; i++) {
    arr[i] = i; // <-- not fair play :) - offset = arr + i*size(int)
  }
寂寞陪衬 2024-08-02 11:55:43

我认为这里的答案很明显:没关系。 就像 jalf 所说的那样,代码最终将大致相同,但即使不是,请查看数字。 您发布的代码创建了一个包含 1000 万个项目的巨大数组,但迭代整个数组只需要百分之几秒。

即使您的应用程序确实正在处理这么多数据,无论您实际对这些数据执行什么操作,都可能比迭代数组花费更多的时间。 只需使用您喜欢的数据结构,然后将时间集中在其余代码上。

为了证明我的观点,这里的代码做了一个更改:对数组项的 i 赋值被 sqrt(i) 的赋值替换。 在我使用 -O2 的机器上,执行时间增加了三倍,从 0.02 秒增加到 0.06 秒。

#include <time.h>
#include <iostream>
#include <vector>
#include <math.h>

using namespace std;

int main() {
  clock_t start,end;
  std::vector<int> vec(9999999);
  std::vector<int>::iterator vecIt = vec.begin();
  std::vector<int>::iterator vecEnd = vec.end();

  start = clock();
  for (int i = 0; vecIt != vecEnd; i++) {
    *(vecIt++) = sqrt(i);
  }
  end = clock();
  cout<<"vector: "<<(double)(end-start)/CLOCKS_PER_SEC<<endl;

  int* arr = new int[9999999];
  start = clock();
  for (int i = 0; i < 9999999; i++) {
    arr[i] = i;
  }
  end = clock();
  cout<<"array: "<<(double)(end-start)/CLOCKS_PER_SEC<<endl;
}

I think the answer here is obvious: it doesn't matter. Like jalf said the code will end up being about the same, but even if it wasn't, look at the numbers. The code you posted creates a huge array of 10 MILLION items, yet iterating over the entire array takes only a few hundredths of a second.

Even if your application really is working with that much data, whatever it is you're actually doing with that data is likely to take much more time than iterating over your array. Just use whichever data structure you prefer, and focus your time on the rest of your code.

To prove my point, here's the code with one change: the assignment of i to the array item is replaced with an assignment of sqrt(i). On my machine using -O2, the execution time triples from .02 to .06 seconds.

#include <time.h>
#include <iostream>
#include <vector>
#include <math.h>

using namespace std;

int main() {
  clock_t start,end;
  std::vector<int> vec(9999999);
  std::vector<int>::iterator vecIt = vec.begin();
  std::vector<int>::iterator vecEnd = vec.end();

  start = clock();
  for (int i = 0; vecIt != vecEnd; i++) {
    *(vecIt++) = sqrt(i);
  }
  end = clock();
  cout<<"vector: "<<(double)(end-start)/CLOCKS_PER_SEC<<endl;

  int* arr = new int[9999999];
  start = clock();
  for (int i = 0; i < 9999999; i++) {
    arr[i] = i;
  }
  end = clock();
  cout<<"array: "<<(double)(end-start)/CLOCKS_PER_SEC<<endl;
}
悲念泪 2024-08-02 11:55:43

问题似乎是您在关闭优化的情况下编译了代码。 在我的机器上,OS X 10.5.7 和 g++ 4.0.1 我实际上发现向量比原始数组快 2.5 倍。

使用 gcc 尝试将 -O2 传递给编译器,看看是否有任何改进。

The issue seems to be that you compiled your code with optimizations turned off. On my machine, OS X 10.5.7 with g++ 4.0.1 I actually see that the vector is faster than primitive arrays by a factor of 2.5.

With gcc try to pass -O2 to the compiler and see if there's any improvement.

胡大本事 2024-08-02 11:55:43

数组迭代速度更快的原因是迭代次数是恒定的,并且编译器能够展开循环。 尝试使用 rand 生成一个数字,并将其乘以一个您想要的大数字,这样编译器将无法在编译时算出它。 然后再试一次,你会看到类似的运行结果。

The reason that your array iterating is faster is that the the number of iteration is constant, and compiler is able to unroll the loop. Try to use rand to generate a number, and multiple it to be a big number you wanted so that compiler wont be able to figure it out at compile time. Then try it again, you will see similar runtime results.

夜灵血窟げ 2024-08-02 11:55:43

您的代码执行效果可能不完全相同的原因之一是,在您的 std::vector 版本上,您增加了两个值,即整数 i 和 std::vector::iterator vecIt。 为了真正等效,你可以重构为

start = clock();
for (int i = 0; i < vec.size(); i++) {
  vec[i] = i;
}
end = clock();
cout<<"vector: "<<(double)(end-start)/CLOCKS_PER_SEC<<endl;

One reason you're code might not be performing quite the same is because on your std::vector version, you are incrimenting two values, the integer i and the std::vector::iterator vecIt. To really be equivalent, you could refactor to

start = clock();
for (int i = 0; i < vec.size(); i++) {
  vec[i] = i;
}
end = clock();
cout<<"vector: "<<(double)(end-start)/CLOCKS_PER_SEC<<endl;
月野兔 2024-08-02 11:55:43

您的代码在两种情况之间提供了不公平的比较,因为您在向量测试中所做的工作比数组测试要多得多。

使用向量,您可以增加迭代器 (vecIT) 和单独的变量 (i) 来生成赋值。

对于数组,您只需递增变量 i 并将其用于双重目的。

Your code provides an unfair comparison between the two cases since you're doing far more work in the vector test than the array test.

With the vector, you're incrementing both the iterator (vecIT) and a separate variable (i) for generating the assignment values.

With the array, you're only incrementing the variable i and using it for dual purpose.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文