What does cltq do in assembly?
0x0000000000400553 <main+59>: mov -0x4(%rbp),%eax
0x0000000000400556 <main+62>: cltq
0x0000000000400558 <main+64>: shl $0x3,%rax
0x000000000040055c <main+68>: mov %rax,%rdx
In fact, my program is very simple:
#include <stdio.h>

extern char **environ;

int main(int argc, char *argv[]) {
    int i = 0;
    while (environ[i]) {
        printf("%s\n", environ[i++]);
    }
    return 0;
}
But the assembly output is quite long:
Dump of assembler code for function main:
0x0000000000400518 <main+0>: push %rbp
0x0000000000400519 <main+1>: mov %rsp,%rbp
0x000000000040051c <main+4>: sub $0x20,%rsp
0x0000000000400520 <main+8>: mov %edi,-0x14(%rbp)
0x0000000000400523 <main+11>: mov %rsi,-0x20(%rbp)
0x0000000000400527 <main+15>: movl $0x0,-0x4(%rbp)
0x000000000040052e <main+22>: jmp 0x400553 <main+59>
0x0000000000400530 <main+24>: mov -0x4(%rbp),%eax
0x0000000000400533 <main+27>: cltq
0x0000000000400535 <main+29>: shl $0x3,%rax
0x0000000000400539 <main+33>: mov %rax,%rdx
0x000000000040053c <main+36>: mov 0x2003e5(%rip),%rax # 0x600928 <environ@@GLIBC_2.2.5>
0x0000000000400543 <main+43>: lea (%rdx,%rax,1),%rax
0x0000000000400547 <main+47>: mov (%rax),%rdi
0x000000000040054a <main+50>: addl $0x1,-0x4(%rbp)
0x000000000040054e <main+54>: callq 0x400418 <puts@plt>
0x0000000000400553 <main+59>: mov -0x4(%rbp),%eax
0x0000000000400556 <main+62>: cltq
0x0000000000400558 <main+64>: shl $0x3,%rax
0x000000000040055c <main+68>: mov %rax,%rdx
0x000000000040055f <main+71>: mov 0x2003c2(%rip),%rax # 0x600928 <environ@@GLIBC_2.2.5>
0x0000000000400566 <main+78>: lea (%rdx,%rax,1),%rax
0x000000000040056a <main+82>: mov (%rax),%rax
0x000000000040056d <main+85>: test %rax,%rax
0x0000000000400570 <main+88>: jne 0x400530 <main+24>
0x0000000000400572 <main+90>: mov $0x0,%eax
0x0000000000400577 <main+95>: leaveq
0x0000000000400578 <main+96>: retq
End of assembler dump.
What I don't understand is this block:
0x000000000040052e <main+22>: jmp 0x400553 <main+59>
0x0000000000400530 <main+24>: mov -0x4(%rbp),%eax
0x0000000000400533 <main+27>: cltq
0x0000000000400535 <main+29>: shl $0x3,%rax
0x0000000000400539 <main+33>: mov %rax,%rdx
0x000000000040053c <main+36>: mov 0x2003e5(%rip),%rax # 0x600928 <environ@@GLIBC_2.2.5>
0x0000000000400543 <main+43>: lea (%rdx,%rax,1),%rax
0x0000000000400547 <main+47>: mov (%rax),%rdi
0x000000000040054a <main+50>: addl $0x1,-0x4(%rbp)
0x000000000040054e <main+54>: callq 0x400418 <puts@plt>
0x0000000000400553 <main+59>: mov -0x4(%rbp),%eax
0x0000000000400556 <main+62>: cltq
0x0000000000400558 <main+64>: shl $0x3,%rax
0x000000000040055c <main+68>: mov %rax,%rdx
0x000000000040055f <main+71>: mov 0x2003c2(%rip),%rax # 0x600928 <environ@@GLIBC_2.2.5>
0x0000000000400566 <main+78>: lea (%rdx,%rax,1),%rax
0x000000000040056a <main+82>: mov (%rax),%rax
0x000000000040056d <main+85>: test %rax,%rax
0x0000000000400570 <main+88>: jne 0x400530 <main+24>
Comments (4)
Mnemonic
cltq is the gas mnemonic for Intel's cdqe, as documented at: https://sourceware.org/binutils/docs/as/i386_002dMnemonics.html
The mnemonics are:
- Convert Long To Quad (cltq): AT&T-style
- Convert Doubleword to Quadword (cdqe): Intel
Terminology:
- quad (quadword): 8 bytes
- long (AT&T), doubleword (Intel): 4 bytes
This is one of the few instructions whose GAS name is very different from the Intel version. as accepts either mnemonic, but Intel-syntax assemblers like NASM may only accept the Intel names.
Effect
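It sign-extends 4 bytes into 8 bytes, which in two's complement means that:
- for negative numbers, the bits of the upper 4 bytes are all set to 1
- for non-negative numbers, they are all set to 0
In C, that usually corresponds to a conversion from signed int to long.
This instruction is only available in 64-bit mode.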
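As a minimal sketch of the effect described above, the following C models what happens to the register value. The names cltq_model and widen are hypothetical helpers made up for illustration, not part of any library. The bit manipulation only restates the sign-extension rule, and a compiler targeting x86-64 may or may not pick cltq for the conversion in widen.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: models what cltq does to the 64-bit RAX value.
   Only the low 32 bits (EAX) matter; the old upper half is discarded. */
uint64_t cltq_model(uint64_t rax)
{
    uint64_t result = rax & 0xFFFFFFFFu;     /* keep EAX only */
    if (result & 0x80000000u)                /* is the sign bit of EAX set? */
        result |= 0xFFFFFFFF00000000ULL;     /* fill the upper 4 bytes with 1s */
    return result;                           /* otherwise they stay 0 */
}

/* The same operation as C spells it: a 32-bit signed value converted
   to a 64-bit signed type. */
int64_t widen(int32_t x)
{
    int64_t wide = x;
    /* Compare bit patterns as unsigned values to stay well-defined. */
    assert((uint64_t)wide == cltq_model((uint32_t)x));
    return wide;
}
```

For instance, cltq_model(0x0123456700000001) yields 1, because whatever was in the upper half of RAX before the instruction is ignored.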
Also consider the following instructions:
- CWDE (AT&T CWTL), CBW (AT&T CBTW): smaller versions of CDQE, also present in 32-bit
- CQO family, which sign-extends RAX into RDX:RAX
- MOVSX family, which both sign-extends and moves: see "What does movsbl instruction do?"
Minimal runnable examples on GitHub with assertions:
- CWDE and CWTL
- CDQE and CLTQ
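The smaller members of the family can be pictured the same way. The sketch below is an illustration only: the *_model function names are hypothetical, and each one relies on C's ordinary signed widening conversion to stand in for the corresponding instruction.

```c
#include <assert.h>
#include <stdint.h>

/* cbw (AT&T cbtw): AL -> AX */
int16_t cbw_model(int8_t al)    { return al; }

/* cwde (AT&T cwtl): AX -> EAX */
int32_t cwde_model(int16_t ax)  { return ax; }

/* cdqe (AT&T cltq): EAX -> RAX */
int64_t cdqe_model(int32_t eax) { return eax; }

/* Widening one step at a time gives the same value as widening directly. */
int64_t widen_in_steps(int8_t al)
{
    int64_t rax = cdqe_model(cwde_model(cbw_model(al)));
    assert(rax == (int64_t)al);
    return rax;
}
```

Viewed as raw bits, cbw_model(-128) is 0xFF80: the sign bit of the byte has been copied into the whole upper byte.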
C example
GCC 4.9.3 emits it:
Compile and disassemble:
contains:
and the behavior is:
cltq promotes an int to an int64. shl $3, %rax turns the index into a byte offset into an array of 64-bit pointers (it multiplies whatever is in rax by 8). What the code is doing is looping through a list of pointers to environment variables. When it finds a value of zero, that's the end, and it drops out of the loop.
Here is a visual on how Linux stores the environment variables in RAM, above the stack. You'll see the pointers starting at 0xbffff75c; that points to 0xbffff893, "TERM=rxvt".
Your compiler is apparently smart enough to optimize the simply-formatted printf to a puts. The fetching of the environment string, and the postincrement of i, are right there in the code. If you don't figure some of this out on your own you'll never really understand it. Just "be" the computer, and step through the loop, using the data I dumped out for you with gdb, and it should all become clear to you.
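To make the stepping-through concrete, here is a rough C rendering of the address computation in the question's loop. It is a sketch under assumptions: slot_address and count_entries are hypothetical names, the code assumes 8-byte pointers (which is what the shift by 3 implies), and the instruction comments map onto the disassembly only approximately.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper mirroring the instruction sequence in the question:
   computes the address of base[i] by hand. */
char **slot_address(char **base, int i)
{
    assert(sizeof(char *) == 8);          /* shl $3 assumes 8-byte pointers */
    int64_t rax = i;                      /* mov -0x4(%rbp),%eax ; cltq */
    uint64_t rdx = (uint64_t)rax << 3;    /* shl $0x3,%rax ; mov %rax,%rdx */
    return (char **)((uintptr_t)base + rdx);  /* lea (%rdx,%rax,1),%rax */
}

/* Walks the pointer list until the terminating null pointer,
   like the while loop over environ. */
size_t count_entries(char **base)
{
    int i = 0;
    while (*slot_address(base, i))        /* mov (%rax),%rax ; test ; jne */
        i++;
    return (size_t)i;
}
```

With a null-terminated array of three strings, slot_address(arr, 2) equals &arr[2] and count_entries(arr) returns 3.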
cltq is the AT&T mnemonic for CDQE, which sign-extends EAX into RAX. It's a short form of movslq %eax, %rax, saving code bytes. It exists because of how x86-64 evolved from 8086 to 386 to AMD64.
It copies the sign bit of EAX to all the upper bits of the wider register, because that's how 2's complement works. The mnemonic is short for Convert Long to Quad.
AT&T syntax (used by GNU as / objdump) uses different mnemonics than Intel for some instructions (see the official docs). You can use objdump -drwC -Mintel or gcc -masm=intel -S to get Intel syntax using the mnemonics that Intel and AMD document in their instruction reference manuals (see links in the x86 tag wiki). (Fun fact: as input, gas accepts either mnemonic in either mode.)
Intel insn ref manual entry for these 3 insns (CBW/CWDE/CDQE).
cltq / cdqe is obviously only available in 64-bit mode, but the other two are available in all modes. movsx and movzx were only introduced with 386, making it easy/efficient to sign/zero extend registers other than al / ax, or to sign/zero extend on the fly while loading.
Think of cltq / cdqe as a special-case shorter encoding of movslq %eax,%rax. It runs just as fast. But the only benefit is saving a couple bytes of code, so it's not worth sacrificing anything else to use it instead of movsxd / movzx.
A related group of instructions copies the sign bit of [e/r]ax into all bits of [e/r]dx. Sign-extending eax into edx:eax is useful before idiv, or simply before returning a wide integer in a pair of registers.
These have no single-instruction equivalent, but you can do them in two instructions: e.g. mov %eax, %edx / sar $31, %edx
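The two-instruction sequence can be sketched in C as follows. The function names sign_fill and edx_eax are hypothetical. Because right-shifting a negative signed value is implementation-defined in C, the sketch uses a comparison rather than eax >> 31 to produce the same result the sar would.

```c
#include <assert.h>
#include <stdint.h>

/* Models mov %eax,%edx ; sar $31,%edx:
   EDX ends up as 32 copies of EAX's sign bit. */
uint32_t sign_fill(int32_t eax)
{
    return eax < 0 ? 0xFFFFFFFFu : 0u;
}

/* Concatenates EDX:EAX into one 64-bit pattern, the form in which
   idiv reads its dividend. */
uint64_t edx_eax(int32_t eax)
{
    uint64_t pair = ((uint64_t)sign_fill(eax) << 32) | (uint32_t)eax;
    /* Same bits as a plain 32-to-64 sign extension. */
    assert(pair == (uint64_t)(int64_t)eax);
    return pair;
}
```

So the register pair holds exactly the bits that a single 64-bit register would hold after cltq; the only difference is where the upper half lives.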
Remembering the mnemonics
The Intel mnemonics for extending within rax all end with e, except for the original 8086 cbw. You can remember that case because even 8086 handled 16-bit integers in a single register, so there'd be no need to set dl to the sign bit of al. div r8 and idiv r8 read the dividend from ax, not from dl:al. So cbw sign-extends al into ax.
The AT&T mnemonics don't have an obvious hint to help you remember which one is which. Some of the ones that write to *dx end with d (for dx?) instead of the usual l for long. cqto breaks that pattern, but an octword is 128b and thus has to be the concatenation of rdx:rax.
IMO the Intel mnemonics are easier to remember, and Intel syntax is easier to read in general. (I learned AT&T syntax first, but got used to Intel because reading Intel/AMD manuals is useful!)
Note that for zero-extension, mov %edi,%edi zero-extends %edi into %rdi, because any write to a 32-bit register zeros the upper 32 bits.
(In practice, try to mov to a different register (e.g. mov %eax, %ecx) because same,same defeats mov-elimination in Intel CPUs. You will often see compiler-generated asm for functions with 32-bit unsigned args use a mov to zero-extend, and unfortunately often with the same register as src and destination.)
For 8 or 16 out to 32 (and implicitly 64), and $0xff, %eax works but is less efficient than movzbl %al, %eax. $0xff doesn't fit in an 8-bit sign-extended immediate, so it needs a full 4-byte 0x000000ff immediate. (Or better, movzbl %al, %ecx so mov-elimination can make it zero latency on Intel CPUs where mov-elimination works for movzx 8->32.)
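For contrast with sign extension, zero extension can be sketched in C like this. The function names are hypothetical, and the casts only model the resulting values; they say nothing about encoding size or latency, which is the actual reason given above for preferring movzbl.

```c
#include <assert.h>
#include <stdint.h>

/* Models mov %edi,%edi: writing a 32-bit register clears bits 63..32. */
uint64_t zero_extend32(uint64_t rdi)
{
    return (uint32_t)rdi;
}

/* and $0xff,%eax and movzbl %al,%eax compute the same value. */
uint32_t zero_extend8(uint32_t eax)
{
    uint32_t via_and   = eax & 0xFFu;     /* and $0xff, %eax */
    uint32_t via_movzx = (uint8_t)eax;    /* movzbl %al, %eax */
    assert(via_and == via_movzx);
    return via_movzx;
}
```

Note that zero_extend32 leaves 0x80000000 as a positive 64-bit value, whereas the sign-extending cltq would turn the same low half into 0xFFFFFFFF80000000.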
If you call a function that lives in another file without declaring it in this file, GCC assumes by default that it returns a 32-bit int, even on a 64-bit OS. So cltq uses only the low 32 bits of RAX (the return value), and the high 32 bits are filled with 1s or 0s according to the sign bit.
Hopefully this page helps:
http://www.mystone7.com/2012/05/23/cltq/
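The consequence for a 64-bit return value such as a pointer can be modelled without relying on an implicit declaration (which newer compilers may reject outright). This is a sketch only: seen_by_caller and survives are hypothetical names, and the example addresses are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: the callee really returned real_rax, but the caller
   believes the return type is int, so it keeps EAX and sign-extends it. */
uint64_t seen_by_caller(uint64_t real_rax)
{
    uint32_t eax = (uint32_t)real_rax;    /* only the low 32 bits are trusted */
    uint64_t rax = eax;
    if (eax & 0x80000000u)
        rax |= 0xFFFFFFFF00000000ULL;     /* cltq fills the top half with 1s */
    assert((uint32_t)rax == eax);         /* the low half is always preserved */
    return rax;
}

/* Returns 1 if the value comes through unchanged, 0 if it is corrupted. */
int survives(uint64_t real_rax)
{
    return seen_by_caller(real_rax) == real_rax;
}
```

A made-up address like 0x00007F1234567890 comes back as 0x34567890, and 0x00007F12B4567890 comes back as 0xFFFFFFFFB4567890, so the caller ends up with a pointer that no longer refers to the original object.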