Short-circuiting of Boolean operands without side effects
For the bounty: How can this behavior be disabled on a case-by-case basis without disabling or lowering the optimization level?
The following conditional expression was compiled on MinGW GCC 3.4.5, where a is of type signed long and m is of type unsigned long.
if (!a && m > 0x002 && m < 0x111)
The CFLAGS used were -g -O2. Here is the corresponding assembly output from GCC (dumped with objdump):
120: 8b 5d d0 mov ebx,DWORD PTR [ebp-0x30]
123: 85 db test ebx,ebx
125: 0f 94 c0 sete al
128: 31 d2 xor edx,edx
12a: 83 7d d4 02 cmp DWORD PTR [ebp-0x2c],0x2
12e: 0f 97 c2 seta dl
131: 85 c2 test edx,eax
133: 0f 84 1e 01 00 00 je 257 <_MyFunction+0x227>
139: 81 7d d4 10 01 00 00 cmp DWORD PTR [ebp-0x2c],0x110
140: 0f 87 11 01 00 00 ja 257 <_MyFunction+0x227>
Addresses 120-131 can easily be traced as first evaluating !a, followed by the evaluation of m > 0x002. The first conditional jump does not occur until 133. By this time, two expressions have been evaluated, regardless of the outcome of the first expression, !a. If a were nonzero (making !a false), the expression can (and should) be concluded immediately, which is not done here.
How does this relate to the C standard, which requires Boolean operators to short-circuit as soon as the outcome can be determined?
Comments (5)
The C standard only specifies the behavior of an "abstract machine"; it does not specify the generation of assembly. As long as the observable behavior of a program matches that on the abstract machine, the implementation can use whatever physical mechanism it likes for implementing the language constructs. The relevant section in the standard (C99) is 5.1.2.3 Program execution.
It is probably a compiler optimization since comparing integral types has no side effects. You could try compiling without optimizations or using a function that has side effects instead of the comparison operator and see if it still does this.
For example, try
and it should print
ac
As others have mentioned, this assembly output is a compiler optimization that doesn't affect program execution (as far as the compiler can tell). If you want to selectively disable this optimization, you need to tell the compiler that your variables should not be optimized across the sequence points in the code.
Sequence points occur at the end of controlling expressions (the evaluations in if, switch, while, do and all three sections of for), after the first operand of logical OR and AND, of the conditional operator (?:) and of the comma operator, and at the end of the expression in a return statement.
To prevent compiler optimization across these points, you must declare your variables volatile. In your example, that means qualifying a and m as volatile.
The reason that this works is that volatile is used to instruct the compiler that it can't predict the behavior of an equivalent machine with respect to the variable. Every access to a volatile object counts as observable behavior, so the compiler must strictly obey the sequence points in your code.
The compiler is optimising: it loads a into EBX, sets AL (part of EAX) to the result of !a, puts the result of the second check into DL (part of EDX), then branches based on a test (a bitwise AND) of EAX and EDX. This saves a branch and leaves the code running faster, without making any difference at all in terms of side effects.
If you compile with -O0 rather than -O2, I imagine it will produce more naive assembly that more closely matches your expectations.
The code is behaving correctly (i.e., in accordance with the requirements of the language standard) either way.
It appears that you're trying to find a way to generate specific assembly code. Of two possible assembly code sequences, both of which behave the same way, you find one satisfactory and the other unsatisfactory.
The only really reliable way to guarantee the satisfactory assembly code sequence is to write the assembly code explicitly. gcc does support inline assembly.
C code specifies behavior. Assembly code specifies machine code.
But all this raises the question: why does it matter to you? (I'm not saying it shouldn't, I just don't understand why it should.)
EDIT: How exactly are a and m defined? If, as you suggest, they're related to memory-mapped devices, then they should be declared volatile, and that might be exactly the solution to your problem. If they're just ordinary variables, then the compiler can do whatever it likes with them (as long as it doesn't affect the program's visible behavior) because you didn't ask it not to.