GCC function padding value

Posted on 2024-10-08 23:56:56

Whenever I compile C or C++ code with optimizations enabled, GCC aligns functions to a 16-byte boundary (on IA-32). If the function is shorter than 16 bytes, GCC pads it with some bytes, which don't seem to be random at all:

19:   c3                      ret
1a:   8d b6 00 00 00 00       lea    0x0(%esi),%esi

It always seems to be either 8d b6 00 00 00 00 ... or 8d 74 26 00.

Do function padding bytes have any significance?

Comments (3)

以为你会在 2024-10-15 23:56:56

The padding is created by the assembler, not by gcc. It merely sees a .align directive (or equivalent) and doesn't know whether the space to be padded is inside a function (e.g. loop alignment) or between functions, so it must insert NOPs of some sort. Modern x86 assemblers use the longest NOP encodings available, so that as few instructions (and cycles) as possible are spent if the padding is actually executed, as it is with loop alignment.
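As a rough illustration of the arithmetic behind the padding, here is a minimal Python sketch (the helper name padding_to_boundary is made up for this example and is not part of any assembler). It assumes the function in the question ends right after the one-byte ret at 0x19:

```python
def padding_to_boundary(address: int, alignment: int = 16) -> int:
    """Return how many filler bytes are needed to reach the next aligned address."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    # Python's % always yields a non-negative result for a positive modulus.
    return (-address) % alignment


# In the question's listing, `ret` is a single byte at 0x19,
# so the first free address after the function is 0x1a.
end_of_function = 0x19 + 1
pad = padding_to_boundary(end_of_function)
print(hex(end_of_function), pad, hex(end_of_function + pad))
```

Under that assumption the gap is 6 bytes, which matches the length of the 6-byte lea shown in the listing.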

Personally, I'm extremely skeptical of alignment as an optimization technique. I've never seen it help much, and it can definitely hurt by greatly increasing the total code size (and with it the pressure on the instruction cache). If you use the -Os optimization level, it's off by default, so there's nothing to worry about. Otherwise you can disable the alignments with the corresponding -f options, such as -fno-align-functions, -fno-align-loops, -fno-align-jumps and -fno-align-labels.

望她远 2024-10-15 23:56:56

The assembler first sees an .align directive. Since it doesn't know if this address is within a function body or not, it cannot output NULL 0x00 bytes, and must generate NOPs (0x90).

However:

lea    esi,[esi+0x0] ; does nothing, pseudocode: ESI = ESI + 0

executes in fewer clock cycles than

nop
nop
nop
nop
nop
nop

If this code happened to fall within a function body (for instance, loop alignment), the lea version would be much faster, while still "doing nothing."
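To make the "fewer instructions for the same gap" point concrete, here is a small Python sketch of a greedy filler. The FILLERS table is hypothetical: it contains only the three encodings quoted in this thread, while a real assembler has its own, larger table and its own selection logic.

```python
from typing import List

# Hypothetical filler table, limited to the byte sequences quoted in this thread.
FILLERS = {
    6: bytes.fromhex("8db600000000"),  # lea 0x0(%esi),%esi, 6-byte form
    4: bytes.fromhex("8d742600"),      # the same lea, 4-byte form
    1: bytes.fromhex("90"),            # single-byte nop
}


def fill(n: int) -> List[bytes]:
    """Greedily cover n bytes of padding with the longest fillers available."""
    if n < 0:
        raise ValueError("padding length cannot be negative")
    chosen = []
    for size in sorted(FILLERS, reverse=True):
        while n >= size:
            chosen.append(FILLERS[size])
            n -= size
    return chosen


for gap in (4, 6, 10):
    chosen = fill(gap)
    print(gap, "bytes:", len(chosen), "instruction(s) instead of", gap, "single-byte nops")
```

With this table a 6-byte gap is covered by one instruction rather than six, which is the effect described above. The sketch only counts instructions; it makes no claim about actual cycle counts on any particular CPU.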

奢望 2024-10-15 23:56:56

The instruction lea 0x0(%esi),%esi just loads the value in %esi into %esi - it's a no-operation (or NOP), which means that if it's executed it will have no effect.

This just happens to be a single-instruction, 6-byte NOP. 8d 74 26 00 is just a 4-byte encoding of the same instruction: it uses an 8-bit displacement and a SIB byte instead of a 32-bit displacement.
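One way to check that both byte sequences really are the same lea is to decode the ModRM (and SIB) bytes by hand. The following Python sketch does that; decode_lea is a made-up helper and is deliberately partial, handling only the base-register-plus-displacement forms that appear in this thread.

```python
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_lea(code: bytes):
    """Decode a 32-bit lea with a base register and a displacement only.

    Returns (destination, base, displacement, length_in_bytes).
    Scaled index registers and the mod=00 / mod=11 forms are rejected.
    """
    if code[0] != 0x8D:
        raise ValueError("not a lea opcode")
    modrm = code[1]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    pos = 2
    if rm == 4:  # rm == 100b means a SIB byte follows
        sib = code[pos]
        pos += 1
        index, base = (sib >> 3) & 7, sib & 7
        if index != 4:  # index == 100b means "no index register"
            raise ValueError("index registers are not handled in this sketch")
    else:
        base = rm
    if mod == 1:
        disp_size = 1  # 8-bit displacement
    elif mod == 2:
        disp_size = 4  # 32-bit displacement
    else:
        raise ValueError("only mod=01 and mod=10 are handled in this sketch")
    disp = int.from_bytes(code[pos:pos + disp_size], "little", signed=True)
    return REG32[reg], REG32[base], disp, pos + disp_size


print(decode_lea(bytes.fromhex("8db600000000")))
print(decode_lea(bytes.fromhex("8d742600")))
```

Both calls report destination esi, base esi and displacement 0; only the reported length differs (6 versus 4 bytes).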
