How to use address constants in GCC x86 inline assembly
The GCC toolchain uses AT&T assembler syntax by default, but support for Intel syntax is available via the .intel_syntax directive.
Additionally, both AT&T and Intel syntax are available in a prefix and a noprefix version, which differ in whether register names must be prefixed with a % sigil.
Depending on which directives are present, the format for address constants changes.
Let's consider the following C code
*(int *)0xdeadbeef = 0x1234;
Using objdump -d, we find that it's compiled to the following assembler instruction
movl $0x1234,0xdeadbeef
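To experiment with this store without touching an address that may not be mapped, the C statement can be wrapped in a small helper that takes the address as a parameter. This is only a sketch: the name poke32 is hypothetical, and the volatile qualifier is added here (it is not in the statement above) so that the compiler does not drop or reorder the store.

```c
#include <stdint.h>

/* Hypothetical helper: store a 32-bit value at a numeric address.
 * The volatile cast keeps the compiler from optimizing the store away. */
void poke32(uintptr_t addr, uint32_t value)
{
    *(volatile uint32_t *)addr = value;
}
```

Calling poke32(0xdeadbeef, 0x1234) corresponds to the statement above and is only meaningful on a system where that address is actually mapped and writable.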
As there are no registers involved, this is the correct syntax for both .att_syntax prefix and .att_syntax noprefix, i.e. embedded in C code, they look like this
__asm__(".att_syntax prefix");
__asm__("movl $0x1234,0xdeadbeef");
__asm__(".att_syntax noprefix");
__asm__("movl $0x1234,0xdeadbeef");
You can optionally surround the address constant with parentheses, i.e.
__asm__("movl $0x1234,(0xdeadbeef)");
will work as well.
When adding a sigil to a plain address constant, the code will fail to compile
__asm__("movl $0x1234,$0xdeadbeef"); // won't compile
When surrounding this expression with parentheses, the compiler will emit wrong code without warning, i.e.
__asm__("movl $0x1234,($0xdeadbeef)"); // doesn't warn, but doesn't work!
This will incorrectly emit the instruction
movl $0x1234,0x0
In Intel mode, an address constant has to be prefixed with a segment register, as well as the operand size and the PTR flag if ambiguity is possible. On my machine (an Intel dual core laptop with Windows XP and current MinGW and Cygwin GCC versions), the register ds is used by default.
Square brackets around the constant are optional. The address constant is also correctly recognized if the segment register is omitted, but the brackets are present. Omitting the register emits a warning on my system, though.
In prefix mode, the segment register has to be prefixed with %, but only using brackets will still work. These are the different ways to generate the correct instruction:
__asm__(".intel_syntax noprefix");
__asm__("mov DWORD PTR ds:0xdeadbeef,0x1234");
__asm__("mov DWORD PTR ds:[0xdeadbeef],0x1234");
__asm__("mov DWORD PTR [0xdeadbeef],0x1234"); // works, but warns!
__asm__(".intel_syntax prefix");
__asm__("mov DWORD PTR %ds:0xdeadbeef,0x1234");
__asm__("mov DWORD PTR %ds:[0xdeadbeef],0x1234");
__asm__("mov DWORD PTR [0xdeadbeef],0x1234"); // works, but warns!
Omitting both segment register and brackets will fail to compile
__asm__("mov DWORD PTR 0xdeadbeef,0x1234"); // won't compile
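An alternative to spelling out the address constant by hand is to pass the location as a memory operand of GCC extended asm, so that the compiler prints the operand in whichever dialect it is currently emitting. The sketch below is an illustration and not part of the experiments above: the function name store_1234 is hypothetical, the {att|intel} braces are GCC's dialect alternatives (the second form is used when compiling with -masm=intel), and a plain C fallback is assumed for non-x86 targets.

```c
/* Hypothetical helper: store 0x1234 through p using inline assembly.
 * The "=m" constraint makes the compiler print the memory operand itself,
 * e.g. (%rdi) in AT&T output or DWORD PTR [rdi] in Intel output. */
void store_1234(volatile int *p)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ volatile("{movl $0x1234, %0|mov %0, 0x1234}" : "=m"(*p));
#else
    *p = 0x1234; /* fallback for targets without x86 inline assembly */
#endif
}
```

Passing (volatile int *)0xdeadbeef would target the fixed address discussed above; whether the compiler then emits an absolute address or goes through a register is its own choice.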
I'll mark this question as community wiki, so if you have anything useful to add, feel free to do so.
Comments (2)
The noprefix/prefix directives only control whether registers require a % prefix(*) (at least it seems so, and that's the only difference the documentation mentions). Value literals always need a $ prefix in AT&T syntax and never in Intel syntax. So the following works:
If you are really inclined to use Intel syntax inline assembly within C code compiled with GCC and assembled with GAS, do not forget to also add the following after it, so that the assembler can grok the rest of the (AT&T syntax) assembly generated by GCC:
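A minimal sketch of this switch-and-restore pattern is given below. It is an illustration rather than the snippet originally posted: the function name load_const_intel is hypothetical, it assumes the file is compiled with GCC's default AT&T output (so .att_syntax prefix restores the mode the compiler expects), and it names eax directly instead of using %0, because operand substitutions are printed in AT&T form in that default mode.

```c
/* Hypothetical helper: load the constant 0x1234 using Intel syntax,
 * then switch the assembler back to AT&T syntax with % prefixes. */
int load_const_intel(void)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    int value;
    __asm__ volatile(
        ".intel_syntax noprefix\n\t" /* Intel syntax, registers without % */
        "mov eax, 0x1234\n\t"        /* no $ on the immediate in Intel syntax */
        ".att_syntax prefix"         /* restore the mode GCC's own output uses */
        : "=a"(value));              /* "=a" binds the output to eax */
    return value;
#else
    return 0x1234; /* fallback for targets without x86 inline assembly */
#endif
}
```

If the restoring directive is left out, everything the compiler emits after the asm statement would be parsed as Intel syntax, which is the failure the paragraph above warns about.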
The reasoning I see for the prefix/noprefix distinction is that, for AT&T syntax, the % prefix is not really needed for registers on Intel architecture, because registers are named. But for uniformity it can be there, because some other architectures (e.g. SPARC) have numbered registers, in which case specifying a low number alone would be ambiguous as to whether a memory address or register was meant.
Here are my own results: