“Used uninitialized” compiler warning in g++
I’m using g++ with warning level -Wall -Wextra and treating warnings as errors (-Werror).
Now I’m sometimes getting an error “variable may be used uninitialized in this function”.
By “sometimes” I mean that I have two independent compilation units that both include the same header file. One compilation unit compiles without error, the other gives the above error.
The relevant piece of code in the header files is as follows. Since the function is pretty long, I’ve only reproduced the relevant bit below.
The exact error is:
'cmpres' may be used uninitialized in this function
And I’ve marked the line with the error by * below.
for (;;) {
    int cmpres; // *
    while (b <= c and (cmpres = cmp(b, pivot)) <= 0) {
        if (cmpres == 0)
            ::std::iter_swap(a++, b);
        ++b;
    }
    while (c >= b and (cmpres = cmp(c, pivot)) >= 0) {
        if (cmpres == 0)
            ::std::iter_swap(d--, c);
        --c;
    }
    if (b > c) break;
    ::std::iter_swap(b++, c--);
}
(cmp is a functor that takes two pointers x and y and returns -1, 0 or +1 if *x < *y, *x == *y or *x > *y respectively. The other variables are pointers into the same array.)
This piece of code is part of a larger function but the variable cmpres is used nowhere else. Hence I fail to understand why this warning is generated. Furthermore, the compiler obviously understands that cmpres will never be read uninitialized (or at least, it doesn’t always warn, see above).
Now I have two questions:
Why the inconsistent behaviour? Is this warning generated by a heuristic? (This is plausible since emitting this warning requires a control flow analysis which is NP hard in the general case and cannot always be performed.)
Why the warning? Is my code unsafe? I have come to appreciate this particular warning because it has saved me from very hard to detect bugs in other cases – so this is a valid warning, at least sometimes. Is it valid here?
An algorithm that diagnoses uninitialized variables with no false negatives or false positives must (as a subroutine) include an algorithm that solves the Halting Problem, which means there is no such algorithm. It is impossible for a computer to get this right 100% of the time.
I don't know how GCC's uninitialized variable analysis works exactly, but I do know it's very sensitive to what early optimization passes have done to the code. So I'm not at all surprised you get false positives only sometimes. It does distinguish cases where it's certain from cases where it can't be certain --
produces "warning: ‘a’ is used uninitialized in this function" (emphasis mine).
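The snippet that belongs in front of that warning text is missing from this copy. As a hypothetical stand-in (the names `definitely` and `maybe` are mine, not from the original answer), the distinction is between a variable that is read on every path without ever being assigned and one that is assigned on only some paths. Which diagnostic actually appears depends on the GCC version and optimization level, so the comments describe the typical expectation, not a guarantee.

```cpp
// Hypothetical illustration; neither function comes from the original post.

// `a` is never assigned before it is read, on any path. This is the shape of
// case where GCC can say "is used uninitialized". Calling it is undefined
// behaviour, so it is meant to be compiled, not run.
int definitely() {
    int a;
    return a;
}

// `a` is assigned only when x is nonzero. The analysis cannot rule out the
// x == 0 path, which is the shape that leads to "may be used uninitialized".
int maybe(int x) {
    int a;
    if (x)
        a = 1;
    return a;  // well-defined only when x != 0
}
```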
EDIT: I found a case where recent versions of GCC (4.3 and later) fail to diagnose an uninitialized variable:
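The example itself did not survive in this copy. Going by the description that follows, it would have had roughly this shape (a reconstruction; the name `foo` is an assumption):

```cpp
// Reconstructed from the surrounding description, not copied from the original.
int foo(int x) {
    int a;          // never assigned anywhere
    if (x)
        return a;   // undefined behaviour whenever x != 0
    return 0;       // the only well-defined path
}
```

Only `foo(0)` has defined behaviour, which is why an optimizer is free to treat the whole body as `return 0;`.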
Early optimizations notice that if x is nonzero, the function's behavior is undefined, so they assume x must be zero and replace the entire body of the function with "return 0;". This happens well before the pass that generates the used-uninitialized warnings, so there's no diagnostic. See GCC bug 18501 for gory details.
I bring this up partially to demonstrate that production-grade compilers can get uninitialized-variable diagnostics wrong both ways, and partially because it's a nice example of the point that undefined behavior can propagate backward in execution time. There's nothing undefined about testing x, but because code control-dependent on x has undefined behavior, a compiler is allowed to assume that the control dependency is never satisfied and discard the test.
There was an interesting discussion on the clang dev mailing list related to those heuristics this week.
The bottom line is: it's actually quite difficult to diagnose uninitialized values without getting exponential behavior...
Apparently (from the discussion), gcc uses a predicate-based approach, but given your experience it seems that it is not always sufficient.
I suspect it's got something to do with the fact that the assignment is mixed into the condition (and after a short-circuiting operator at that...). Have you tried without it?
I think both the gcc and clang folks would be very interested in this example, since it's a relatively common practice in C or C++ and thus could benefit from some tuning.
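One way to try the suggestion above is to move the assignment out of the short-circuit condition, so that `cmpres` is initialized at its declaration on every path. The following is only a sketch: `ThreeWayCmp`, `Cursors`, `partition_sketch`, the `int` element type and the choice of `*first` as the pivot are all assumptions made to get something self-contained, since the question shows only the loop.

```cpp
#include <algorithm>  // std::iter_swap

// Hypothetical comparator matching the description in the question:
// returns -1, 0 or +1 for *x < *y, *x == *y, *x > *y.
struct ThreeWayCmp {
    int operator()(const int* x, const int* y) const {
        return (*x < *y) ? -1 : (*x > *y) ? 1 : 0;
    }
};

// Final positions of the four cursors, returned for inspection.
struct Cursors {
    int* a;
    int* b;
    int* c;
    int* d;
};

// Assumes first < last and uses *first as the pivot (an assumption; the
// question does not show how the cursors are set up).
inline Cursors partition_sketch(int* first, int* last) {
    ThreeWayCmp cmp;
    const int* pivot = first;
    int* a = first + 1;
    int* b = a;
    int* c = last - 1;
    int* d = c;
    for (;;) {
        while (b <= c) {
            const int cmpres = cmp(b, pivot);  // initialized where declared
            if (cmpres > 0) break;             // same exit as "<= 0" failing
            if (cmpres == 0) std::iter_swap(a++, b);
            ++b;
        }
        while (c >= b) {
            const int cmpres = cmp(c, pivot);
            if (cmpres < 0) break;             // same exit as ">= 0" failing
            if (cmpres == 0) std::iter_swap(d--, c);
            --c;
        }
        if (b > c) break;
        std::iter_swap(b++, c--);
    }
    Cursors result = {a, b, c, d};
    return result;
}
```

The control flow is meant to match the original loop; the only change is that each `cmpres` is a fresh, initialized local. Whether this silences the warning in a given GCC version is untested here; it simply removes the pattern the heuristic appears to struggle with.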
The code is correct, but the compiler is failing to identify that the variable is never used without initialization.
I would suggest that it's likely an error in the heuristic; that's what the "may" is for. I suspect that not many loop conditions look quite like that. The code is not unsafe, because on all control paths cmpres is assigned before use. However, I certainly wouldn't find it wrong to initialize it first.
You could, however, have some kind of variable shadowing going on here. That would be the only explanation I could think of for only one of the two translation units giving errors.