GCC 4.3/4.4 vs. MSC 6 on i386: optimizing for size fails

Posted on 2024-12-03 10:08:28

I am not sure what I am doing wrong. I've tried reading the manuals on GCC's calling conventions and found nothing useful there. My current problem is that GCC generates excessively large code for a very simple operation, as shown below.

main.c:

#ifdef __GNUC__
    // defines for GCC
    typedef void (* push1)(unsigned long);
    #define PUSH1(P,A0)((push1)P)((unsigned long)A0)
#else
    // defines for MSC
    typedef void (__stdcall * push1)(unsigned long);
    #define PUSH1(P,A0)((push1)P)((unsigned long)A0)
#endif

int main() {
    // pointer to nasm-linked exit syscall "function".
    // will not work for win32 target, provided as an example.
    PUSH1(0x08048200,0x7F);
}

Now, let's build it with GCC and dump the result: gcc -c main.c -Os; objdump -d main.o

main.o:     file format elf32-i386

Disassembly of section .text:

00000000 <.text>:
   0:   8d 4c 24 04             lea    0x4(%esp),%ecx
   4:   83 e4 f0                and    $0xfffffff0,%esp
   7:   ff 71 fc                pushl  -0x4(%ecx)
   a:   b8 00 82 04 08          mov    $0x8048200,%eax
   f:   55                      push   %ebp
  10:   89 e5                   mov    %esp,%ebp
  12:   51                      push   %ecx
  13:   83 ec 10                sub    $0x10,%esp
  16:   6a 7f                   push   $0x7f
  18:   ff d0                   call   *%eax
  1a:   8b 4d fc                mov    -0x4(%ebp),%ecx
  1d:   83 c4 0c                add    $0xc,%esp
  20:   c9                      leave  
  21:   8d 61 fc                lea    -0x4(%ecx),%esp
  24:   c3                      ret

That's the smallest code I am able to get. If I don't specify -O*, or specify other values, it is 0x29 or more bytes long.

Now, let's build it with the MS C compiler v6 (yes, the one from 1998, IIRC): wine /mnt/ssd/msc/6/cl /c /TC main.c; wine /mnt/ssd/msc/6/dumpbin /disasm main.obj

Dump of file main.obj

File Type: COFF OBJECT

_main:
  00000000: 55                 push        ebp
  00000001: 8B EC              mov         ebp,esp
  00000003: 6A 7F              push        7Fh
  00000005: B8 00 82 04 08     mov         eax,8048200h
  0000000A: FF D0              call        eax
  0000000C: 5D                 pop         ebp
  0000000D: C3                 ret

How do I make GCC generate code of a similar size? Any hints or tips? Don't you agree that the resulting code should be as small as that? Why does GCC add so much useless code? I thought it would be smarter than something as old as MSC 6 when optimizing for size. What am I missing here?


Comments (2)

深爱成瘾 2024-12-10 10:08:28

main() is special here: gcc is doing some extra work to make the stack 16-byte aligned at the entry point of the program. So the sizes of the results aren't directly comparable. Try renaming main() to f() and you'll see that gcc generates drastically different code.

(The MSVC-compiled code doesn't need to care about alignment because Windows has different rules for stack alignment.)
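
The renaming suggested here could look roughly like the sketch below. It is not the asker's exact file: record_arg() and last_arg are hypothetical stand-ins for the fixed address 0x08048200, added only so the snippet can be linked and run on any target. Since the callee is now a real function rather than a cast integer constant, the generated code will not match the question's listing byte for byte; the only point is that the call sits in f() instead of main().

```c
/* f.c: the question's call moved out of main() into an ordinary function.
 * Assumption: record_arg() and last_arg are hypothetical helpers that
 * replace the hard-coded address 0x08048200 from the question.
 *
 * One possible way to inspect the result on an x86 Linux host
 * (-m32 needs 32-bit support installed):
 *   gcc -m32 -Os -c f.c && objdump -d f.o
 */

typedef void (*push1)(unsigned long);
#define PUSH1(P, A0) ((push1)(P))((unsigned long)(A0))

/* Remembers the last argument so the call has an observable effect. */
unsigned long last_arg = 0;

#ifdef __GNUC__
__attribute__((noinline)) /* keep a real call instruction in f() */
#endif
void record_arg(unsigned long value)
{
    last_arg = value;
}

/* Same body as the question's main(), but in a function that is not
 * the program entry point, so no entry-point stack realignment applies. */
void f(void)
{
    PUSH1(record_arg, 0x7F);
}
```

Compiling with -c is enough to compare sizes, because no main() is needed in an object file.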

网名女生简单气质 2024-12-10 10:08:28

This is the best reference I can give. I'm on Windows now and too lazy to log in to my Linux machine to test. Here (MinGW GCC 4.5.2), the code is smaller than yours. One difference is the calling convention: stdcall has an advantage of a few bytes over cdecl (the default on GCC if nothing is specified, also with -O1 and, I guess, with -Os), because with stdcall the caller does not have to clean up the stack.
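
The question's typedef requests stdcall only in the MSC branch, so GCC falls back to cdecl and the caller has to adjust the stack after the call. A minimal sketch of asking GCC for the same convention on i386 follows. CALLEE_CLEANS, note_arg(), seen_arg and g() are hypothetical names used only for this illustration, and the callee is a real function instead of the question's fixed address so that the snippet can run. On targets other than i386 the macro expands to nothing, because the attribute would be ignored there.

```c
/* Assumption: CALLEE_CLEANS is a made-up macro name for this sketch. */
#if defined(__GNUC__) && defined(__i386__)
    #define CALLEE_CLEANS __attribute__((stdcall))
#elif defined(_MSC_VER) && defined(_M_IX86)
    #define CALLEE_CLEANS __stdcall
#else
    #define CALLEE_CLEANS /* not 32-bit x86: use the default convention */
#endif

/* Pointer type and call macro as in the question, but with the same
 * convention requested for both compilers. */
typedef void (CALLEE_CLEANS *push1)(unsigned long);
#define PUSH1(P, A0) ((push1)(P))((unsigned long)(A0))

/* Hypothetical callee standing in for the address 0x08048200. */
unsigned long seen_arg = 0;

void CALLEE_CLEANS note_arg(unsigned long value)
{
    seen_arg = value;
}

/* With stdcall the callee pops its own argument, so the caller does
 * not need a stack adjustment after the call. */
void g(void)
{
    PUSH1(note_arg, 0x7F);
}
```

The callee and the pointer type must agree on the convention; casting a cdecl function to a stdcall pointer (or the reverse) leaves the stack unbalanced.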

Here's how I compiled it and the result (the source code is copied straight from your post):

gcc -S test.c:

_main:
    pushl   %ebp     #
    movl    %esp, %ebp   #,
    andl    $-16, %esp   #,
    subl    $16, %esp    #,
    call    ___main  #
    movl    $127, (%esp)     #,
    movl    $134513152, %eax     #, tmp59
    call    *%eax    # tmp59
    leave
    ret

gcc -c -o test.o test.c && objdump -d test.o:

test.o:     file format pe-i386


Disassembly of section .text:

00000000 <_main>:
   0:   55                      push   %ebp
   1:   89 e5                   mov    %esp,%ebp
   3:   83 e4 f0                and    $0xfffffff0,%esp
   6:   83 ec 10                sub    $0x10,%esp
   9:   e8 00 00 00 00          call   e <_main+0xe>
   e:   c7 04 24 7f 00 00 00    movl   $0x7f,(%esp)
  15:   b8 00 82 04 08          mov    $0x8048200,%eax
  1a:   ff d0                   call   *%eax
  1c:   c9                      leave
  1d:   c3                      ret
  1e:   90                      nop
  1f:   90                      nop