Can different optimization levels lead to functionally different code?

Posted on 2024-11-15 18:30:01 · 538 words · 2 views · 0 comments

I am curious about the liberties that a compiler has when optimizing. Let's limit this question to GCC and C/C++ (any version, any flavour of standard):

Is it possible to write code which behaves differently depending on which optimization level it was compiled with?

The example I have in mind is printing different bits of text in various constructors in C++ and getting a difference depending on whether copies are elided (though I've not been able to make such a thing work).

Counting clock cycles is not permitted. If you have an example for a non-GCC compiler, I'd be curious, too, but I can't check it. Bonus points for an example in C. :-)

Edit: The example code should be standard compliant and not contain undefined behaviour from the outset.

Edit 2: Got some great answers already! Let me up the stakes a bit: The code must constitute a well-formed program and be standards-compliant, and it must compile to correct, deterministic programs in every optimization level. (That excludes things like race-conditions in ill-formed multithreaded code.) Also I appreciate that floating point rounding may be affected, but let's discount that.

I just hit 800 reputation, so I think I shall blow 50 reputation as bounty on the first complete example to conform to (the spirit) of those conditions; 25 if it involves abusing strict aliasing. (Subject to someone showing me how to send bounty to someone else.)

Comments (14)

一腔孤↑勇 2024-11-22 18:30:01

The portion of the C++ standard that applies is §1.9 "Program execution". It reads, in part:

conforming implementations are required to emulate (only) the observable behavior of the abstract machine as explained below. ...

A conforming implementation executing a well-formed program shall produce the same observable behavior as one of the possible execution sequences of the corresponding instance of the abstract machine with the same program and the same input. ...

The observable behavior of the abstract machine is its sequence of reads and writes to volatile data and calls to library I/O functions. ...

So, yes, code may behave differently at different optimization levels, but (assuming that every level yields a conforming implementation) it cannot behave observably differently.

EDIT: Allow me to correct my conclusion: Yes, code may behave differently at different optimization levels as long as each behavior is observably identical to one of the behaviors of the standard's abstract machine.
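
As a rough illustration of what the "observable behavior" wording permits, here is a minimal sketch. The function name sum_to is made up for this example. The loop below has no volatile accesses and no I/O, so a compiler is free to keep it as a loop, unroll it, or replace it with a closed-form expression; what it may not change is the value that is eventually returned (and possibly printed by the caller).

```cpp
// Hypothetical example function: sums the integers 1..n.
// Nothing inside the loop is "observable" in the standard's sense
// (no volatile access, no library I/O), so the generated code may look
// completely different at -O0 and -O2 while the result stays the same.
unsigned long long sum_to(unsigned int n) {
    unsigned long long total = 0;
    for (unsigned int i = 1; i <= n; ++i) {
        total += i;
    }
    return total;
}
```

Calling sum_to(100) yields 5050 at every optimization level; only the instructions used to get there are allowed to differ.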

不甘平庸 2024-11-22 18:30:01

Floating point calculations are a ripe source for differences. Depending on how the individual operations are ordered, you can get more/less rounding errors.

Less than safe multi-threaded code can also have different results depending on how memory accesses are optimized, but that's essentially a bug in your code anyhow.

And as you mentioned, side effects in copy constructors can vanish when optimization levels change.

浅笑依然 2024-11-22 18:30:01

Is it possible to write code which behaves differently depending on which optimization level it was compiled with?

Only if you trigger a compiler's bug.

EDIT

This example behaves differently on gcc 4.5.2:

void foo(int i) {
  foo(i+1);
}

int main(void) {
  foo(0);
}

Compiling with -O0 produces a program that crashes with a segmentation fault.
Compiling with -O2 produces a program that enters an endless loop.

余生再见 2024-11-22 18:30:01

OK, here is my flagrant play for the bounty: a concrete example. I'll put together the bits from other people's answers and my comments.

For the purpose of showing different behaviour at different optimization levels, "optimization level A" shall denote gcc -O0 (I'm using version 4.3.4, but it doesn't matter much; I think any even vaguely recent version will show the difference I'm after), and "optimization level B" shall denote gcc -O0 -fno-elide-constructors.

Code is simple:

#include <iostream>

struct Foo {
    ~Foo() { std::cout << "~Foo\n"; }
};

int main() {
    Foo f = Foo();
}

Output at optimization level A:

~Foo

Output at optimization level B:

~Foo
~Foo

The code is totally legal, but the output is implementation-dependent because of copy constructor elision, and in particular it's sensitive to gcc's optimization flag that disables copy ctor elision.

Note that generally speaking, "optimization" refers to compiler transformations that can alter behavior that is undefined, unspecified or implementation-defined, but not behavior that is defined by the standard. So any example that satisfies your criteria is necessarily a program whose output is either unspecified or implementation-defined. In this case the standard leaves it unspecified whether copy ctors are elided; I just happen to be lucky that GCC reliably elides them pretty much whenever allowed, but has an option to disable that.

夜声 2024-11-22 18:30:01

For C, almost all operations are strictly defined in the abstract machine, and optimizations are only allowed if the observable result is exactly that of the abstract machine. Exceptions to that rule that come to mind:

  • undefined behavior doesn't have to be consistent between different compiler runs or executions of the faulty code
  • floating point operations may cause different rounding
  • arguments to function calls can be evaluated in any order
  • expressions with volatile-qualified type may or may not be evaluated just for their side effects
  • identical const-qualified compound literals may or may not be folded into one static memory location
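
The point about argument evaluation order can be sketched as follows. This is only an illustration: the names g_log, tagged, add and evaluation_order are invented here, and nothing guarantees that a particular compiler actually changes the order between optimization levels. The standard merely allows either order.

```cpp
#include <string>

// Hypothetical shared log that records the order in which arguments are evaluated.
std::string g_log;

// Appends a tag to the log as a side effect and passes the value through.
int tagged(char tag, int value) {
    g_log += tag;
    return value;
}

int add(int x, int y) { return x + y; }

// Returns "ab" or "ba": the order in which the two arguments of add()
// are evaluated is unspecified, so both results are conforming.
std::string evaluation_order() {
    g_log.clear();
    int sum = add(tagged('a', 1), tagged('b', 2));
    (void)sum;  // the sum is 3 either way; only the log can differ
    return g_log;
}
```

A program that prints the result of evaluation_order() is well-formed and deterministic for a given build, yet its output may legitimately differ between compilers or option sets.
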
月亮邮递员 2024-11-22 18:30:01

Anything that is Undefined Behavior according to the standard can change its behavior depending on optimization level (or moon-phase, for that matter).

末蓝 2024-11-22 18:30:01

Since copy constructor calls can be optimized away, even if they have side effects, having copy constructors with side-effects will cause unoptimized and optimized code to behave differently.
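
A minimal sketch of such a side-effecting copy constructor, assuming a made-up Tracer type that counts its copies. It relies on the named return value optimization, which remains optional even in newer standards. Whether elision is tied to the -O level is up to the compiler; as noted in another answer here, GCC elides by default and offers -fno-elide-constructors to turn that off.

```cpp
// Hypothetical type whose copy constructor has a visible side effect.
struct Tracer {
    static int copies;
    Tracer() = default;
    Tracer(const Tracer&) { ++copies; }  // side effect: count every copy made
};
int Tracer::copies = 0;

// Returning a named local: the copy into the return value may be elided (NRVO).
Tracer make_tracer() {
    Tracer t;
    return t;
}

// Returns how many copy-constructor calls actually happened.
// With elision this is 0; without it, it is 1 or 2 depending on the
// language version, because C++17 guarantees elision only for the
// initialization of 'result' from the returned temporary.
int count_copies() {
    Tracer::copies = 0;
    Tracer result = make_tracer();
    (void)result;
    return Tracer::copies;
}
```

Printing the value of count_copies() therefore gives a conforming program whose output depends on whether the compiler performs the elision.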

别把无礼当个性 2024-11-22 18:30:01

The -fstrict-aliasing option can easily cause changes in behavior if you have two pointers to the same block of memory. This is supposed to be invalid but is actually quite common.
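
To make the aliasing point concrete, here is a small sketch with a made-up function write_then_read. Under the type-based aliasing rules an int object and a float object cannot overlap, so with -fstrict-aliasing (which GCC enables at -O2 and above) the compiler may return the constant 1 without reloading *i. The function itself is well-defined when called with distinct objects; calling it with two pointers into the same storage (for example via reinterpret_cast) is the invalid-but-common case the answer describes, and is undefined behavior.

```cpp
// Hypothetical function: writes through both pointers, then reads *i back.
// With strict aliasing the compiler may assume the store to *f cannot
// modify *i, and so may fold the return value to the constant 1.
int write_then_read(int* i, float* f) {
    *i = 1;
    *f = 0.0f;
    return *i;
}
```

Called as write_then_read(&i, &f) with a separate int i and float f, the result is 1 at every optimization level. Only the aliased, undefined call can show a difference between -O0 and -O2.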

单身狗的梦 2024-11-22 18:30:01

This C program invokes undefined behavior, but does display different results in different optimization levels:

#include <stdio.h>
/*
$ for i in 0 1 2 3 4 
    do echo -n "$i: " && gcc -O$i x.c && ./a.out 
  done
0: 5
1: 5
2: 5
3: -1
4: -1
*/

void f(int a) {
  int b;
  printf("%d\n", (int)(&a-&b));
}
int main() {
 f(0);
 return 0;
}
桃气十足 2024-11-22 18:30:01

GCC defines the __OPTIMIZE__ macro when a non-zero optimization level is used. You can use it like this:

#include <stdio.h>

int main(void) {
#ifdef __OPTIMIZE__
    printf("Code compiled with -O1 or higher\n");
#else
    printf("Code compiled with -O0\n");
#endif
    return 0;
}
愁杀 2024-11-22 18:30:01

The same source code:

[image: source code]

behaves differently before and after enabling -finline-small-functions.

[image: before enabling -finline-small-functions]

[image: after enabling -finline-small-functions]

-finline-small-functions is enabled at -O2/-O3.

沉默的熊 2024-11-22 18:30:01

Two C source files:

foo6.c

void p2(void);

int main() {
    p2();
    return 0;
}

bar6.c

#include <stdio.h>

char main;

void p2() {
    printf("0x%x\n", main);
}

When both modules are compiled into one executable at optimization levels one and zero, they print two different values: 0x48 for -O1 and 0x55 for -O0.

[Screenshot of terminal: an example of it working in my environment]

烈酒灼喉 2024-11-22 18:30:01

a.c:

char *f1(void) { return "hello"; }

b.c:

#include <stdio.h>

char *f1(void);

int main()
{
    if (f1() == "hello") printf("yes\n");
    else printf("no\n");
}

Output depends on whether merge string constants optimization is enabled or disabled:

$ gcc a.c b.c -o a -fno-merge-constants; ./a
no
$ gcc a.c b.c -o a -fmerge-constants; ./a
yes

混吃等死 2024-11-22 18:30:01

We had an interesting example in my OS course today.
We analyzed a software mutex that could be broken by optimization, because the compiler does not know about the parallel execution.

The compiler can reorder statements that do not operate on dependent data.
As I already stated, in parallelized code this dependency is hidden from the compiler, so the code can break.
The example I gave would lead to hard times in debugging, as thread safety is broken and the code behaves unpredictably because of OS scheduling issues and concurrent access errors.
