Why isn't the instruction pointer a normal register that can be used with MOV or ADD?
The Wikipedia article about x86 assembly says that "the IP register cannot be accessed by the programmer directly."
Directly means with instructions like mov and add, the same way we can read and write EAX.
Why not? What is the reason behind this? What are the technical restrictions?
There are special instructions like jmp to set it, and call to push the old value before setting a new one. (And in x86-64, it can be read with LEA using a RIP-relative addressing mode.) See "Reading program counter directly" for details.
Answers (4)
You can't access it directly because there's no legitimate use case. Having any arbitrary instruction change eip would make branch prediction very difficult, and would probably open up a whole host of security issues.
You can edit eip using jmp, call or ret. You just can't directly read from or write to eip using normal operations.
Setting eip to a register is as simple as jmp eax. You can also do push eax; ret, which pushes the value of eax to the stack and then returns (i.e. pops and jumps). The third option is call eax, which does a call to the address in eax.
Reading can be done like this:
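The usual trick relies on the fact that call pushes the address of the instruction that follows it, so that address can be popped or read back afterwards. As a rough compiler-level sketch of the same idea (not the answer's own listing), the C function below uses the GCC/Clang extension __builtin_return_address(0) to read that saved address. The name address_after_call is made up for illustration, and the code assumes a GCC-compatible compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the address at which execution resumes after
 * the call to this function, i.e. the caller's return address.
 * Relies on GCC/Clang extensions; noinline keeps a real call in place. */
__attribute__((noinline))
void *address_after_call(void)
{
    void *ret = __builtin_return_address(0);
    assert(ret != NULL);  /* a real call always has somewhere to return to */
    return ret;
}
```

On x86 the value returned is the one the caller's call instruction pushed, which is what a call to the very next instruction followed by a pop would hand back. On architectures that keep the return address in a link register the builtin reads it from there instead.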
That would have been a possible design for x86. ARM does expose its program counter for read/write as R15. That's unusual, though.
It allows a very compact function prologue/epilogue, along with the ability to push or pop multiple registers with a single instruction: push {r5, lr} on entry, and pop {r5, pc} to return. (Popping the saved value of the link register into the program counter.)
However, it makes high-perf / out-of-order ARM implementations less convenient, and was dropped for AArch64.
So it's possible, but uses up one of the registers. 32-bit ARM has 16 integer registers (including PC), so a register number takes 4 bits to encode in ARM machine code. Another register is almost always tied up as the stack pointer, so ARM has 14 general-purpose integer registers. (LR can be saved to the stack, so it can be and is used as a general-purpose register inside function bodies).
Most of modern x86 is inherited from 8086. It was designed with fairly compact variable-length instruction encoding, and only 8 registers, requiring only 3 bits for each src and dst register in the machine code.
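To make the 3-bit point concrete, here is a small C sketch of how a register-to-register x86 instruction packs its two operands into the ModRM byte: 2 bits of mode, then a 3-bit reg field and a 3-bit r/m field. The function name modrm and the enum are illustrative, not from any library; the register numbers are the standard x86 encodings.

```c
#include <assert.h>
#include <stdint.h>

/* Standard 3-bit register numbers used in x86 machine code. */
enum reg32 { EAX = 0, ECX = 1, EDX = 2, EBX = 3, ESP = 4, EBP = 5, ESI = 6, EDI = 7 };

/* Pack the ModRM byte: mod in bits 7-6, reg in bits 5-3, r/m in bits 2-0.
 * mod = 3 means both operands are registers (no memory operand). */
uint8_t modrm(unsigned mod, unsigned reg, unsigned rm)
{
    assert(mod < 4 && reg < 8 && rm < 8);  /* 2 + 3 + 3 bits fill the byte */
    return (uint8_t)((mod << 6) | (reg << 3) | rm);
}
```

For example, mov eax, ecx can be encoded as the bytes 89 C8: opcode 89 (MOV r/m32, r32) followed by modrm(3, ECX, EAX), which is 0xC8. All eight register numbers are already taken, so there is no spare one that could have named the instruction pointer.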
In the original 8086, the registers were not very general-purpose, and SP-relative addressing isn't possible in 16-bit mode, so essentially 2 registers (SP and BP) are tied up for stack stuff. This leaves only 6 somewhat general-purpose registers, and having one of them be the PC instead of general-purpose would be a huge reduction in available registers, greatly increasing the amount of spill/reload in typical code.
AMD64 added r8-r15, and the RIP-relative addressing mode. lea rsi, [rip+whatever], and RIP-relative addressing modes for direct access to static data and constants, are all you need for efficient position-independent code. Indirect JMP instructions are totally sufficient for writing to RIP.
There isn't really anything to be gained by allowing arbitrary instructions to be used to read or write the PC, since you can always do the same thing with an integer register and an indirect jump. It would be almost pure downside for x86-64's R15 to be the same thing as RIP, especially for the architecture's performance as a compiler target. (Hand-written asm weird stuff was already very much an uncommon niche thing by 2000, when AMD64 was designed.)
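As a hedged illustration of what RIP-relative addressing computes: the 32-bit displacement stored in the instruction is measured from the address of the next instruction, not from the start of the current one. The helper below does that arithmetic for something like lea rsi, [rip+whatever], which is 7 bytes long in its usual encoding. The name rip_rel32 and the example addresses are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Displacement an assembler or linker would store for a RIP-relative operand.
 * insn_addr: address of the instruction; insn_len: its total length in bytes;
 * target: address of the data being referenced. */
int32_t rip_rel32(uint64_t insn_addr, unsigned insn_len, uint64_t target)
{
    /* RIP already points at the following instruction when the address is formed. */
    int64_t disp = (int64_t)(target - (insn_addr + insn_len));
    assert(disp >= INT32_MIN && disp <= INT32_MAX);  /* must fit in 32 bits */
    return (int32_t)disp;
}
```

With the instruction at 0x401000 and the data at 0x402000, rip_rel32(0x401000, 7, 0x402000) gives 0xFF9. Because only the distance is encoded, the same bytes work wherever the image is loaded, which is what makes the code position-independent.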
So AMD64 is really the first time that x86 could plausibly have gained a fully-exposed program counter like ARM, but there were many good reasons not to do that.
I think they meant that the IP register cannot be accessed directly in the same way the other registers are accessed. Programmers can definitely write to IP, for example by issuing a jump instruction.
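In C, the everyday way a program loads a computed value into the instruction pointer is an indirect call through a function pointer. A minimal sketch with made-up names (call_through, add_one) follows; compilers for x86 typically turn this into an indirect call or jmp through a register, though the exact instruction depends on the compiler and optimization level.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*int_fn)(int);

/* Example target; any function with this signature would do. */
int add_one(int x)
{
    return x + 1;
}

/* Transfers control to whatever address target holds, then returns its result. */
int call_through(int_fn target, int arg)
{
    assert(target != NULL);
    return target(arg);
}
```

Calling call_through(add_one, 41) returns 42. The destination is ordinary data held in an integer register until the moment of the branch, which is the sense in which the instruction pointer is writable, just not with mov.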
jmp will set the EIP register. This code will set eip to 00401000:
And for getting EIP: