C++ code strangely skipped with no optimizations. Any ideas?
I looked for an answer to this for two days with no success. I've never come across this problem before, so I'll try my best. Please bear with me.
I returned to a C++ project I created over a year ago, which at the time ran without problems. I came across this interesting and incredibly annoying problem the other day as I was trying to get the same program to run. The code was something like:
file.h
...
short id;
...
file.cc
id = 0;
while (id < some_large_number)
{
    id = foo();
    if (id == 2)
    {
        //do something
    }
    else if (id == 2900)
    {
        //do something
    }
    else if (id == 30000)
    {
        //do something
    }
    else if (id == 40000)
    {
        //do something
    }
    else if (id == 45000)
    {
        //do something
    }
    else
    {
        //do something else
    }
}
The constants were macros in hex notation, which I have expanded for this example. It turns out this really was a bug in my code, but the debugger did not make it easy to discover. Here's what happened:
As I stepped through the code in GDB (compiled with no optimizations), I noticed that GDB jumped straight to the else statement after reaching if (id == 30000), every time. Because the numbers were C macros in hex notation, I did not notice at first that 40000 was beyond the limit of a signed short. This was very misleading, and I spent hours trying to figure it out: I recompiled external libraries and reinstalled g++, among other things.
Obviously, making id an unsigned short fixed the problem. The other problem seems like a compiler issue. But I still don't understand: why were those sections of code completely skipped during execution, even with no optimizations? Why would it not go through each if statement, so that I could identify the real problem? Any ideas?
Thanks so much. I hope this is okay for a first question.
Comments (5)
If you enable all the warnings from gcc, it will tell you at compile time that this is going to happen.
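As a minimal sketch of what such a diagnostic looks like, the snippet below reproduces the comparison from the question. The file name demo.cc and the macro ID_LARGE are hypothetical, and the sketch assumes a 16-bit short. In recent GCC releases the relevant check is -Wtype-limits, which is enabled by -Wextra rather than by -Wall alone; the exact wording of the message may vary by version.

```cpp
// demo.cc (hypothetical file name)
// Compile with extra warnings enabled, for example:
//   g++ -Wall -Wextra -O0 -c demo.cc
// GCC is then expected to report that the comparison below is always false
// because of the limited range of the data type.

// Hypothetical stand-in for one of the hex macros in the question.
#define ID_LARGE 0x9C40  // 40000 in decimal

// Assumes short is 16 bits wide, so its maximum value is 32767.
// id is promoted to int before the comparison, but it can never hold 40000,
// so this function returns false for every possible argument.
bool matches_large(short id)
{
    return id == ID_LARGE;
}
```

Writing the constant in hex, as in the original macros, is what makes the out-of-range value easy to overlook when reading the source.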
short is 16 bits long here, so its range is -32768 to 32767. It can therefore never be 40000 or 45000, and the compiler eliminated those branches as dead code, since they can never be reached.
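To make the range argument concrete, here is a small sketch of what happens when the values from the question are stored in a short versus an unsigned short. The helper names to_short and to_ushort are made up for illustration, and the sketch assumes a 16-bit short, which the static_assert lines check.

```cpp
#include <limits>

// The sketch only holds where short is 16 bits wide.
static_assert(std::numeric_limits<short>::max() == 32767,
              "sketch assumes a 16-bit short");
static_assert(std::numeric_limits<unsigned short>::max() == 65535,
              "sketch assumes a 16-bit unsigned short");

// Narrow an int to short, as an assignment to a short variable would.
// For out-of-range values the result is implementation-defined before C++20
// and wraps modulo 2^16 since C++20; GCC wraps in both cases.
short to_short(int value)
{
    return static_cast<short>(value);
}

// Narrow an int to unsigned short; 40000 and 45000 both fit unchanged.
unsigned short to_ushort(int value)
{
    return static_cast<unsigned short>(value);
}
```

With GCC, to_short(40000) yields -25536, so a short can only ever compare equal to a negative number in that case, never to 40000 itself. to_ushort(40000) keeps the value 40000, which is why switching id to unsigned short made the branches reachable.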
GCC is an excellent optimizing compiler, but even with warnings enabled via -Wall (and promoted to errors via -Werror), it still doesn't produce the same level of information that a diagnostic compiler does. While developing the code I would recommend using Clang, a diagnostic compiler, to help find bugs and errors. Clang is intended to be compatible with GCC, and with the exception of some more esoteric features I have had no problem switching CC between the two in my Makefile.
Being an optimizing compiler, I believe GCC enables dead-code elimination by default. This would cause every branch the compiler detects as impossible, such as those outside the bounds of your id variable, to be eliminated. You might be able to disable that type of dead-code elimination.
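Rather than trying to switch the elimination off, another option is to turn an out-of-range constant into a compile error, whichever compiler is used. The following is only a sketch: the names ID_A, ID_B, ID_C, IdType, and fits_in are hypothetical, since the question does not show the real macro names.

```cpp
#include <limits>

// Hypothetical names for the ID constants, written in hex as in the question.
constexpr long ID_A = 0x7530;  // 30000
constexpr long ID_B = 0x9C40;  // 40000
constexpr long ID_C = 0xAFC8;  // 45000

// True if value can be represented in the integer type T.
template <typename T>
constexpr bool fits_in(long value)
{
    return value >= static_cast<long>(std::numeric_limits<T>::min())
        && value <= static_cast<long>(std::numeric_limits<T>::max());
}

// The type used for id. Changing this to short makes the checks below
// fail at compile time on platforms where short is 16 bits wide.
using IdType = unsigned short;

static_assert(fits_in<IdType>(ID_A), "ID_A does not fit in IdType");
static_assert(fits_in<IdType>(ID_B), "ID_B does not fit in IdType");
static_assert(fits_in<IdType>(ID_C), "ID_C does not fit in IdType");
```

A check like this fails at build time, so the mismatch shows up before anyone has to step through the code in a debugger.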
Recall that during compilation, the C++ compiler looks through the code and determines which parts execute and in what order. If the compiler determines that a part of the code can never run, it has no reason to generate or optimize code for it.
For example, an if-statement whose condition can never be true will never execute, so there is no reason to optimize its body.
This sort of case would be picked up if you compile with:
g++ -Wall mycode.c
where -Wall enables a broad set of common warnings (adding -Wextra enables further checks such as -Wtype-limits) and mycode.c is your project file.
As for execution, stepping through in GDB shows the actual flow of the program. If a branch of an if-statement is false, why would execution ever go through that section of code? Only one branch of an if / else-if / else chain can be taken.
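The "only one branch" point can be sketched with the chain from the question, rewritten as a function that reports which branch was taken. The function name branch_for and the returned branch numbers are made up for illustration; id is declared unsigned short here, as in the asker's fix.

```cpp
// Returns the number of the branch taken for a given id (0 for the final else).
// The constants are the expanded decimal values shown in the question.
int branch_for(unsigned short id)
{
    if (id == 2)
    {
        return 1;
    }
    else if (id == 2900)
    {
        return 2;
    }
    else if (id == 30000)
    {
        return 3;
    }
    else if (id == 40000)
    {
        return 4;
    }
    else if (id == 45000)
    {
        return 5;
    }
    else
    {
        return 0;
    }
}
```

Each call returns exactly one branch number. With unsigned short, branches 4 and 5 are reachable. With a 16-bit short parameter they never would be, which matches what GDB showed.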
I hope that helps you out.
My conclusion is the same as yours: It seems like it was optimized out even without "optimizations" turned on. Maybe these constant predicates "always true"/"always false" are used to skip the code somewhere directly in the code generation step, i.e. way sooner than any -O switch optimizations are being performed. Just a guess.