Why is processing with global variables faster than with local variables in C++?
I wrote a small C++ program with some dumb processing, just to test a few cache optimizations and study code improvements, and stumbled onto something very weird.
With the arrays declared static, or declared outside the main function as globals, the code runs in about 0.5 seconds on average. If I just move the arrays inside the main function, the same processing takes about 15 seconds on average. I can't figure out why; all I can find are articles saying that local variables are faster than globals.
Does anyone have any idea what's happening?
I'm compiling on Windows with g++ installed via MinGW, and running the code on a desktop with an i3-7100.
EDIT:
- The goal is not to improve speed; these are just tests to study cache usage by moving, removing, or merging arrays. The optimization flags do in fact turn this into fast code. But what are they changing? What was wrong with the array placement that the flags fix?
- I'm compiling with just g++ -o
EDIT 2:
I initialized the arrays as some comments suggested. The uninitialized version is indeed the slow one, and the initialized version runs at the same speed as the global code.
Why? What's happening under the hood?
EDIT 3:
After some suggestions and explanations in the comments, I ran some tests in a fair test environment created by @paddy and shared in the comments: https://godbolt.org/z/8ev85qjP5
Code:
#include <chrono>
#include <iostream>

#define TAM 10
#define N 10000

#ifdef USE_GLOBAL
volatile double output[N] = {}, values[N] = {}, error[N] = {};
#endif

int main()
{
#ifndef USE_GLOBAL
    volatile double output[N] = {}, values[N] = {}, error[N] = {};
#endif
    std::cout << "Starting" << std::endl;
    auto t1 = std::chrono::high_resolution_clock::now();
    {
        for (int total = 0; total < TAM; total++) {
            for (int i = 0; i < N; i++) {
                for (int j = 0; j < N; j++) {
                    output[i] += (values[j] + error[j]) / i + 1;
                }
            }
        }
    }
    auto t2 = std::chrono::high_resolution_clock::now();
    auto duration =
        (std::chrono::duration_cast<std::chrono::microseconds>(t2 - t1)
             .count());
    float time = (float)duration / 1000000;
    std::cout << "Processing time = " << time << " seconds."
              << std::endl;
}