Why does GCC's output machine code contain nop instructions?
Every time I run objdump -d, I see batches of nop instructions (instructions that do nothing) in the asm code.
For example, take this program:
#include <stdio.h>
#include <math.h>

int main()
{
    printf("Hello World!\n");
    printf("cos: %f\n", cos(1));
    return 1;
}
The objdump output, for example, has 2 nops at the end of the entry point:
0000000000400450 <_start>:
400450: 31 ed xor %ebp,%ebp
400452: 49 89 d1 mov %rdx,%r9
400455: 5e pop %rsi
400456: 48 89 e2 mov %rsp,%rdx
400459: 48 83 e4 f0 and $0xfffffffffffffff0,%rsp
40045d: 50 push %rax
40045e: 54 push %rsp
40045f: 49 c7 c0 00 06 40 00 mov $0x400600,%r8
400466: 48 c7 c1 70 05 40 00 mov $0x400570,%rcx
40046d: 48 c7 c7 34 05 40 00 mov $0x400534,%rdi
400474: e8 bf ff ff ff callq 400438 <__libc_start_main@plt>
400479: f4 hlt
40047a: 90 nop
40047b: 90 nop
And that is just one of many examples, but you get the idea. Why is the C code compiled this way? Thanks in advance.
Comments (2)
The nops are padding added so that the next function starts on a 4-byte boundary. (Notice that the address following the last nop is 40047c, which is divisible by 4.)

Very often these are just padding so that whatever follows starts on a word (or similar) boundary again, since fetching code that is not aligned on such a boundary can be more expensive for the CPU.