Why does gcc not remove this check of a non-volatile variable?
This question is mostly academic. I ask out of curiosity, not because this poses an actual problem for me.
Consider the following incorrect C program.
#include <signal.h>
#include <stdio.h>

static int running = 1;

void handler(int u) {
    running = 0;
}

int main() {
    signal(SIGTERM, handler);
    while (running)
        ;
    printf("Bye!\n");
    return 0;
}
This program is incorrect because the handler interrupts the program flow, so running can be modified at any time and should therefore be declared volatile. But let's say the programmer forgot that.
gcc 4.3.3, with the -O3 flag, compiles the loop body (after one initial check of the running flag) down to the infinite loop
.L7:
    jmp .L7
which was to be expected.
Now we put something trivial inside the while loop, like:
    while (running)
        putchar('.');
And suddenly, gcc does not optimize the loop condition anymore! The loop body's assembly now looks like this (again at -O3):
.L7:
    movq stdout(%rip), %rsi
    movl $46, %edi
    call _IO_putc
    movl running(%rip), %eax
    testl %eax, %eax
    jne .L7
We see that running is re-loaded from memory each time through the loop; it is not even cached in a register. Apparently gcc now thinks that the value of running could have changed.
So why does gcc suddenly decide that it needs to re-check the value of running in this case?
Comments (5)
In the general case it's difficult for a compiler to know exactly which objects a function might have access to and therefore could potentially modify. At the point where putchar() is called, GCC doesn't know whether there might be a putchar() implementation able to modify running, so it has to be somewhat pessimistic and assume that running might in fact have been changed.

For example, there might be a putchar() implementation later in the translation unit that modifies running directly.

Even if there's no putchar() implementation in the translation unit, something there could, for example, pass the address of the running object to outside code, so that putchar() might be able to modify it.

Note that your handler() function is globally accessible, so putchar() might call handler() itself (directly or otherwise), which is an instance of the above situation.

[Struck through in the original answer:] On the other hand, since running is visible only to the translation unit (being static), by the time the compiler gets to the end of the file it should be able to determine that there is no opportunity for putchar() to access it (assuming that's the case), and the compiler could go back and 'fix up' the pessimization in the while loop.

Since running is static, the compiler might be able to determine that it's not accessible from outside the translation unit and make the optimization you're talking about. However, since it's accessible through handler() and handler() is accessible externally, the compiler can't optimize the access away. Even if you make handler() static, it's accessible externally since you pass its address to another function.

Note that in your first example, even though what I mentioned in the above paragraph is still true, the compiler can optimize away the access to running because the 'abstract machine model' the C language is based on doesn't take asynchronous activity into account except in very limited circumstances. One of these is the volatile keyword and another is signal handling, though the requirements on signal handling aren't strong enough to prevent the compiler from optimizing away the access to running in your first example.

In fact, C99 describes the abstract machine's behavior in pretty much these exact circumstances (5.1.2.3, Example 1): in an implementation that optimizes within each translation unit, so that actual and abstract semantics agree only at function calls across translation-unit boundaries, objects referred to by interrupt service routines activated by the signal function would require explicit specification of volatile storage.

Finally, you should note that the C99 standard also says (7.14.1.1) that if a signal occurs other than as the result of calling abort or raise, the behavior is undefined if the handler refers to any object with static storage duration other than by assigning a value to an object declared as volatile sig_atomic_t.

So strictly speaking the running variable may need to be declared as volatile sig_atomic_t.
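As an illustration of the stricter declaration, here is a minimal sketch of the flag written as volatile sig_atomic_t. The helper name spin_until_signalled and the use of raise() to deliver the signal from inside the loop are assumptions made so that the example terminates on its own; they are not part of the original program.

```c
#include <signal.h>

/* volatile: every loop test performs a fresh load from memory.
   sig_atomic_t: the object type a handler is allowed to assign to. */
static volatile sig_atomic_t running = 1;

static void handler(int signum) {
    (void)signum;   /* the signal number is not needed here */
    running = 0;
}

/* Hypothetical helper: spins until the handler clears the flag
   and returns the number of iterations that were executed. */
int spin_until_signalled(void) {
    int iterations = 0;

    running = 1;
    signal(SIGTERM, handler);
    while (running) {
        /* Send the signal to ourselves on the third pass so the
           demonstration ends without outside help. */
        if (++iterations == 3)
            raise(SIGTERM);
    }
    return iterations;
}
```

With this declaration the empty-body loop from the question can no longer be folded into an unconditional jump, because each test of running counts as an access the compiler must actually perform.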

Because the call to putchar() could change the value of running. GCC only knows that putchar() is an external function and does not know what it does; for all GCC knows, putchar() could call handler().

GCC probably assumes that the call to putchar() can modify any global variable, including running.
Take a look at the pure function attribute, which states that the function does not have side effects on the global state. I suspect that if you replace putchar() with a call to a "pure" function, GCC will reintroduce the loop optimization.
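For reference, a small sketch of how the pure attribute is spelled. It is a GCC extension (Clang accepts it as well), not standard C. The names threshold, above_threshold and count_above are hypothetical and exist only for this illustration; whether GCC then hoists a load such as the one of running out of a loop is the answerer's suspicion, not something this sketch demonstrates.

```c
/* Hypothetical global that the pure function reads but never writes. */
static int threshold = 10;

/* pure: the result depends only on the argument and on readable global
   state, and the call has no observable side effects. The compiler may
   therefore assume that globals are unchanged across the call. */
__attribute__((pure))
static int above_threshold(int x) {
    return x > threshold;
}

/* Counts how many of the n values exceed the threshold. */
int count_above(const int *values, int n) {
    int count = 0;
    for (int i = 0; i < n; i++)
        count += above_threshold(values[i]);
    return count;
}
```

Note that the attribute is a promise made by the programmer: GCC does not verify it, so marking a function pure when it does modify global state leads to miscompiled code.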

Thank you all for your answers and comments. They have been very helpful, but none of them provides the full story. [Edit: Michael Burr's answer now does, making this somewhat redundant.] I'll sum up here.

Even though running is static, handler is not static; therefore it might be called from putchar and change running in that way. Since the implementation of putchar is not known at this point, it could conceivably call handler from the body of the while loop.

Suppose handler were static. Can we optimize away the running check then? The answer is no, because the signal implementation is also outside this compilation unit. For all gcc knows, signal might store the address of handler somewhere (which, in fact, it does), and putchar might then call handler through this pointer even though it has no direct access to that function.

So in what cases can the running check be optimized away? It seems that this is only possible if the loop body does not call any functions from outside this translation unit, so that it is known at compilation time what does and does not happen inside the loop body.

This explains why forgetting a volatile is not such a big deal in practice as it might seem at first.
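The escape route described above can be sketched in a single file. The functions register_callback and emit_char are hypothetical stand-ins for signal and putchar; in the real scenario they would live in another translation unit, where the compiler cannot see their bodies and must assume the worst.

```c
#include <stddef.h>

/* Stand-ins for code the compiler normally cannot see (hypothetical names). */
static void (*saved_callback)(int) = NULL;

/* Plays the role of signal(): remembers the function pointer. */
void register_callback(void (*cb)(int)) {
    saved_callback = cb;
}

/* Plays the role of putchar(): may call back through the saved pointer. */
int emit_char(int c) {
    if (c == '!' && saved_callback != NULL)
        saved_callback(0);
    return c;
}

static int running = 1;

/* static, yet reachable from outside once its address has been handed out. */
static void handler(int u) {
    (void)u;
    running = 0;
}

/* Returns how many characters were emitted before the flag was cleared. */
int count_loop_iterations(void) {
    const char *text = "..!..";
    int i = 0;

    running = 1;
    register_callback(handler);
    while (running)
        emit_char(text[i++]);
    return i;
}
```

Here the loop stops after the third character because emit_char reaches handler through the stored pointer, which is exactly the possibility gcc has to allow for when the callee is opaque.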

putchar can change running. Only link-time analysis could, in theory, determine that it doesn't.