GCC inline assembler, mixing register sizes (x86)
Does anyone know how I can get rid of the following assembler warning?
Code is x86, 32 bit:
int test (int x)
{
    int y;
    // do a bit-rotate by 8 on the lower word. leave upper word intact.
    asm ("rorw $8, %0\n\t": "=q"(y) :"0"(x));
    return y;
}
If I compile it I get the following (very valid) warning:
Warning: using `%ax' instead of `%eax' due to `w' suffix
What I'm looking for is a way to tell the compiler/assembler that I want to access the lower 16 bit sub-register of %0. Accessing the byte sub-registers (in this case AL and AH) would be nice to know as well.
I've already chosen the "q" modifier, so the compiler is forced to use EAX, EBX, ECX or EDX. I've made sure the compiler has to pick a register that has sub-registers.
I know that I can force the asm-code to use a specific register (and its sub-registers), but I want to leave the register-allocation job up to the compiler.
Comments (4)
You can use %w0 if I remember right. I just tested it, too. :-)
Edit: In response to the OP, yes, you can do the following too:
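A minimal sketch of both forms, assuming GCC or Clang extended asm on x86 or x86-64. The function names are made up for illustration, and the `w`, `b` and `h` modifiers are the ones listed in the x86 Operand Modifiers section of the GCC manual. The second function uses the "Q" constraint so that the chosen register is guaranteed to have a high-byte name such as %ah. A plain C fallback is included for other targets.

```c
#include <stdint.h>

#if defined(__i386__) || defined(__x86_64__)

/* Rotate the low 16 bits by 8; the upper 16 bits stay intact. */
uint32_t swap_low_bytes_w(uint32_t x)
{
    uint32_t y;
    /* %w0 prints the 16-bit name of operand 0 (e.g. %ax instead of %eax),
       so the assembler no longer has to guess from the 'w' suffix. */
    __asm__ ("rorw $8, %w0" : "=q"(y) : "0"(x) : "cc");
    return y;
}

/* Same effect, written as an exchange of the two low byte sub-registers. */
uint32_t swap_low_bytes_bh(uint32_t x)
{
    uint32_t y;
    /* %b0 is the low byte (e.g. %al), %h0 the second byte (e.g. %ah).
       "Q" restricts the allocator to the a, b, c and d registers. */
    __asm__ ("xchgb %b0, %h0" : "=Q"(y) : "0"(x));
    return y;
}

#else

/* Portable fallback for non-x86 targets (no inline asm). */
uint32_t swap_low_bytes_w(uint32_t x)
{
    uint32_t lo = x & 0xffffu;
    return (x & 0xffff0000u) | (((lo >> 8) | (lo << 8)) & 0xffffu);
}

uint32_t swap_low_bytes_bh(uint32_t x)
{
    return swap_low_bytes_w(x);
}

#endif
```

In both variants the compiler still picks the register; the modifier only changes which name of that register is printed into the assembly.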
For x86 it's documented in the x86 Operand Modifiers section of the Extended Asm part of the manual.
For non-x86 instruction sets, you may have to dig through their .md files in the GCC source. For example, gcc/config/i386/i386.md was the only place to find this before it was officially documented.
(Related, for vector registers: In GNU C inline asm, what are the size-override modifiers for xmm/ymm/zmm for a single operand?)
Long ago, but I'll likely need this for my own future reference...
Adding on to Chris's fine answer: the key is using a modifier between the '%' and the operand number. For example, "MOV %1, %0" might become "MOV %q1, %w0".
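To show where the modifier sits in a working instruction, here is a small sketch, assuming GCC or Clang on x86 or x86-64. The function name is hypothetical. It uses `%w1` for the 16-bit name of the input register and `%k0` for the 32-bit name of the output register, with a plain C fallback for other targets.

```c
#include <stdint.h>

/* Zero-extend the low 16 bits of x into a 32-bit result. */
uint32_t low_word(uint32_t x)
{
#if defined(__i386__) || defined(__x86_64__)
    uint32_t y;
    /* %w1 -> 16-bit name of operand 1 (e.g. %cx)
       %k0 -> 32-bit name of operand 0 (e.g. %eax) */
    __asm__ ("movzwl %w1, %k0" : "=r"(y) : "r"(x));
    return y;
#else
    return x & 0xffffu;   /* portable fallback, no inline asm */
#endif
}
```

The modifier letter goes directly after the '%' and before the operand number, and each operand can carry a different one.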
I couldn't find anything in constraints.md, but /gcc/config/i386/i386.c had a potentially useful comment in the source for print_reg(), and a comment further down for ix86_print_operand() offers an example. A few more useful options are listed under Output Template of the GCC Internals documentation.
The '%c2' construct allows one to properly format an LEA instruction using an offset; note the crucial but sparsely documented 'c' (written '%c1' when the offset is operand 1). A macro built this way is equivalent to adding the byte offset to the pointer in plain C, but without making use of the usual integer arithmetic execution ports.
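A sketch of such a macro, assuming GCC or Clang on x86 or x86-64. The macro name `LEA_ADD_BYTES` and the helper `lea_demo` are made up for illustration, and the byte count is assumed to be a compile-time constant, as the "i" constraint requires. Other targets get a plain C fallback.

```c
#include <stddef.h>

#if defined(__i386__) || defined(__x86_64__)
/* Advance ptr by a constant number of bytes with LEA.
   %c1 prints the constant without the '$' immediate prefix, so it can be
   used as the displacement of an address: "lea 16(%rax), %rax". */
#define LEA_ADD_BYTES(ptr, bytes)                                  \
    __asm__ volatile ("lea %c1(%0), %0"                            \
                      : "+r"(ptr)     /* read and written */       \
                      : "i"(bytes))   /* compile-time constant */
#else
/* Portable fallback: ordinary pointer arithmetic. */
#define LEA_ADD_BYTES(ptr, bytes) \
    ((ptr) = (void *)((char *)(ptr) + (bytes)))
#endif

/* Returns how far the pointer moved, in bytes. */
int lea_demo(void)
{
    char buf[32];
    char *p = buf;
    LEA_ADD_BYTES(p, 16);
    return (int)(p - buf);
}
```

Without the 'c', the constant would be printed as an immediate (with a '$' in AT&T syntax), which is not valid as an address displacement.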
Edit to add:
Making direct calls in x64 can be difficult, as it requires yet another undocumented modifier: '%P0' (which seems to be for PIC). A lower case 'p' modifier also seems to function the same in GCC, although only the capital 'P' is recognized by ICC. More details are probably available at /gcc/config/i386/i386.c. Search for "'p'".
While I'm thinking about it ... you should replace the "q" constraint with a capital "Q" constraint in Chris's second solution:
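A sketch of the "Q" constraint in use, assuming GCC or Clang on x86 or x86-64. The function name is hypothetical. It reads the second-lowest byte through the `h` modifier, which only works if the allocator is limited to a, b, c or d. Other targets get a plain C fallback.

```c
#include <stdint.h>

/* Extract bits 8..15 of x. */
uint32_t second_byte(uint32_t x)
{
#if defined(__i386__) || defined(__x86_64__)
    uint32_t y;
    /* %h1 names the second-lowest byte of operand 1 (e.g. %ah).
       "Q" guarantees a register that has such a name. The output is "=Q"
       too, because in 64-bit mode a high-byte register cannot appear in an
       instruction that needs a REX prefix (e.g. one writing r8d-r15d). */
    __asm__ ("movzbl %h1, %k0" : "=Q"(y) : "Q"(x));
    return y;
#else
    return (x >> 8) & 0xffu;   /* portable fallback, no inline asm */
#endif
}
```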
"q" and "Q" are slightly different in 64-bit mode, where you can get the lowest byte for all of the integer registers (ax, bx, cx, dx, si, di, sp, bp, r8-r15). But you can only get the second-lowest byte (e.g. ah) for the four original 386 registers (ax, bx, cx, dx).
So apparently there are tricks to do this... but it may not be so efficient. 32-bit x86 processors are generally slow at manipulating 16-bit data in general purpose registers. You ought to benchmark it if performance is important.
Unless this is (a) performance critical and (b) proves to be much faster, I would save myself some maintenance hassle and just do it in C:
With GCC 4.2 and -O2 this gets optimized down to six instructions...
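A plain C version of the rotate could look like the sketch below. The function name is illustrative, and this is not necessarily the exact code the instruction count above refers to. It splits the value into its upper and lower 16-bit halves, swaps the two bytes of the lower half, and recombines them.

```c
#include <stdint.h>

/* Rotate the low 16 bits of x by 8; the upper 16 bits stay intact. */
uint32_t rotate_low_word(uint32_t x)
{
    uint32_t hi = x & 0xffff0000u;   /* upper word, untouched */
    uint32_t lo = x & 0x0000ffffu;   /* lower word, to be rotated */
    return hi | (((lo >> 8) | (lo << 8)) & 0xffffu);
}
```

This leaves register allocation and instruction selection entirely to the compiler, and it works on any target.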