Is gcc's insane optimisation level (-O3) not insane enough?

Posted on 2024-10-18 13:57:48

As part of answering another question, I wanted to show that the insane level of optimisation of gcc (-O3) would basically strip out any variables that weren't used in main. The code was:

#include <stdio.h>
int main (void) {
   char bing[71];
   int x = 7;
   bing[0] = 11;
   return 0;
}

and the gcc -O3 output was:

    .file "qq.c"
    .text
    .p2align 4,,15
.globl main
    .type main, @function
main:
    pushl %ebp
    xorl %eax, %eax
    movl %esp, %ebp
    popl %ebp
    ret
    .size main, .-main
    .ident "GCC: (Ubuntu 4.4.3-4ubuntu5) 4.4.3"
    .section .note.GNU-stack,"",@progbits

Now I can see it's removed the local variables but there's still quite a bit of wastage in there. It seems to me that the entire:

    pushl %ebp
    xorl %eax, %eax
    movl %esp, %ebp
    popl %ebp
    ret

section could be replaced with the simpler:

    xorl %eax, %eax
    ret

Does anyone have any idea why gcc does not perform this optimisation? I know that would save very little for main itself but, if this were done with normal functions as well, the effect of unnecessarily adjusting the stack pointer in a massive loop would be considerable.

The command used to generate the assembly was:

gcc -O3 -std=c99 -S qq.c

Comments (2)

自由范儿 2024-10-25 13:57:48

You can enable that particular optimization with the -fomit-frame-pointer compiler flag. Doing so makes debugging impossible on some machines and substantially more difficult on everything else, which is why it's usually disabled.

Although your GCC documentation may say that -fomit-frame-pointer is enabled at various optimization levels, you'll likely find that that's not the case; you'll almost certainly have to enable it explicitly yourself.
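
To check this on functions other than main, a minimal sketch along the following lines can be compiled both ways and the two assembly listings compared. The file name frame_demo.c and the function names unused_locals, add_one and sum_with_calls are made up for illustration and are not from the question. The noinline attribute is a GCC extension, used here only so that -O3 does not inline the call away.

```c
/* frame_demo.c: hypothetical file name for this sketch.
 *
 * Generate the two listings to compare:
 *   gcc -O3 -std=c99 -S frame_demo.c -o with_fp.s
 *   gcc -O3 -std=c99 -fomit-frame-pointer -S frame_demo.c -o without_fp.s
 */

/* Same shape as the question's main(): the array and the int are never
 * read, so an optimising build is free to drop them. */
int unused_locals(void) {
    char bing[71];
    int x = 7;
    bing[0] = 11;
    (void)x; /* silences the unused-variable warning */
    return 0;
}

/* noinline asks GCC not to inline this function, so the loop below
 * should still contain a real call to it. */
__attribute__((noinline))
static int add_one(int n) {
    return n + 1;
}

/* Calls add_one() once per iteration; this is the "normal function in a
 * massive loop" case the question worries about. */
int sum_with_calls(int count) {
    int total = 0;
    for (int i = 0; i < count; i++) {
        total = add_one(total);
    }
    return total;
}
```

Diffing with_fp.s against without_fp.s should show whether the pushl %ebp / movl %esp, %ebp / popl %ebp sequence is still emitted around each function body. On compilers or targets where frame-pointer omission is already the default, the two listings may well be identical.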

你又不是我 2024-10-25 13:57:48

Turning on -fomit-frame-pointer (source) should get rid of the extra stack manipulations.

GCC apparently left those in because they facilitate debugging (getting a stack trace when needed), although the docs note that -fomit-frame-pointer is the default starting with GCC 4.6.
