How do I save the registers on x86_64 for an interrupt service routine?
I am looking at some old code from a school project, and while trying to compile it on my laptop I ran into some problems. It was originally written for an old 32-bit version of gcc. I was trying to convert some of the assembly to 64-bit compatible code and hit a few snags.
Here is the original code:
pusha
pushl %ds
pushl %es
pushl %fs
pushl %gs
pushl %ss
pusha is not valid in 64-bit mode. So what would be the proper way to do this in x86_64 assembly while in 64-bit mode?
There has got to be a reason why pusha is not valid in 64-bit mode, so I have a feeling manually pushing all the registers may not be a good idea.
pusha is not valid in 64-bit mode because it is redundant. Pushing each register individually is exactly the thing to do.
Hi, it might not be the correct way to do it, but one can create a pair of macros, one that pushes the general-purpose registers and one that pops them again, and eventually add the other r8-r15 registers if one needs to.
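The macro pair suggested here might look something like the sketch below. It is only an illustration under stated assumptions: the names PUSHAQ, POPAQ and pushaq_demo are hypothetical (they are not real instructions or part of any library), and the block assumes GCC or Clang on an x86-64 ELF target such as Linux. The macros are written in GNU assembler syntax inside a top-level asm statement so they can be exercised from C.

```c
#include <assert.h>
#include <stdint.h>

#if !(defined(__x86_64__) && defined(__GNUC__) && defined(__ELF__))
#error "This sketch assumes x86-64, an ELF target, and GCC or Clang."
#endif

/*
 * PUSHAQ / POPAQ (hypothetical names): push and pop the 15 general-purpose
 * registers other than rsp. POPAQ pops in exactly the reverse order.
 *
 * pushaq_demo (hypothetical helper): runs PUSHAQ, stores the number of bytes
 * the stack pointer moved into *bytes, then undoes everything with POPAQ.
 */
__asm__(
    ".macro PUSHAQ\n\t"
    "pushq %rax;  pushq %rcx;  pushq %rdx;  pushq %rbx\n\t"
    "pushq %rbp;  pushq %rsi;  pushq %rdi\n\t"
    "pushq %r8;   pushq %r9;   pushq %r10;  pushq %r11\n\t"
    "pushq %r12;  pushq %r13;  pushq %r14;  pushq %r15\n\t"
    ".endm\n\t"
    ".macro POPAQ\n\t"
    "popq %r15;   popq %r14;   popq %r13;   popq %r12\n\t"
    "popq %r11;   popq %r10;   popq %r9;    popq %r8\n\t"
    "popq %rdi;   popq %rsi;   popq %rbp\n\t"
    "popq %rbx;   popq %rdx;   popq %rcx;   popq %rax\n\t"
    ".endm\n\t"
    ".pushsection .text\n\t"
    ".globl pushaq_demo\n\t"
    ".type pushaq_demo, @function\n"
    "pushaq_demo:\n\t"
    "movq %rsp, %rax\n\t"      /* remember rsp before the pushes        */
    "PUSHAQ\n\t"
    "subq %rsp, %rax\n\t"      /* rax = old rsp - new rsp               */
    "movq %rax, (%rdi)\n\t"    /* rdi still holds the 'bytes' argument  */
    "POPAQ\n\t"
    "ret\n\t"
    ".size pushaq_demo, .-pushaq_demo\n\t"
    ".popsection\n"
);

/* Implemented in the asm block above. */
void pushaq_demo(uint64_t *bytes);
```

Calling pushaq_demo(&n) should leave n at 15 * 8 bytes, one 8-byte slot per register. In a real handler the same two macros would simply bracket the handler body.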
Take, for example, a short program I was testing today. I wanted to do the same thing: back up all the registers before making the syscalls we had just learned about. I first tried pusha and popa, which I found in an old IA-32 Intel Architecture Software Developer's Manual, but that did not work. Pushing the registers manually does work, however.
This is the result when it is compiled for x64:
Without the pusha and popa mnemonics it works, and this is the result:
This will work in 32-bit mode:
Intel Ref Document
However, it does work manually if you want to try this method:
And the result would be:
Long story short, you can push the registers onto the stack manually, one by one, and pop them afterwards to restore them if required.
AMD needed some room to add new opcodes for REX prefixes and some other new instructions when they developed the 64-bit x86 extensions, so they changed the meaning of some opcodes to those new instructions. Several of the affected instructions were simply short forms of existing instructions or were otherwise not necessary. PUSHA was one of the victims. It's not clear why they banned PUSHA, though; it doesn't seem to overlap any new instruction opcodes. Perhaps they reserved the PUSHA and POPA opcodes for future use, since the instructions are completely redundant, are no faster than individual pushes, and don't occur frequently enough in code to matter.
The order of PUSHA was the order of the instruction encoding: eax, ecx, edx, ebx, esp, ebp, esi, edi. Note that it redundantly pushed esp! You need to know esp to find the data it pushed!
If you are converting code to 64-bit, the PUSHA code is no good anyway: you need to update it to push the new registers r8 thru r15. You also need to save and restore a much larger SSE state, xmm8 thru xmm15, assuming you are going to clobber them.
If the interrupt handler code is simply a stub that forwards to C code, you don't need to save all of the registers. You can assume that the C compiler will generate code that preserves rbx, rbp, rsi, rdi, and r12 thru r15 (this is the Microsoft x64 convention), so you should only need to save and restore rax, rcx, rdx, and r8 thru r11. (Note: on Linux or other System V ABI platforms, the compiler preserves only rbx, rbp, and r12-r15; you can expect rsi and rdi to be clobbered.)
The segment registers hold no value in long mode (if the interrupted thread is running in 32-bit compatibility mode you must preserve the segment registers, thanks ughoavgfhw). Actually, they got rid of most of the segmentation in long mode, but FS is still reserved for operating systems to use as a base address for thread-local data. The register value itself doesn't matter; the bases of FS and GS are set through MSRs 0xC0000100 and 0xC0000101. Assuming you won't be using FS, you don't need to worry about it; just remember that any thread-local data accessed by the C code could be using any random thread's TLS. Be careful of that, because C runtime libraries use TLS for some functionality (example: strtok typically uses TLS).
Loading a value into FS or GS (even in user mode) will overwrite the FSBASE or GSBASE MSR. Since some operating systems use GS as "processor local" storage (they need a way to have a pointer to a structure for each CPU), they need to keep it somewhere that won't get clobbered by loading GS in user mode. To solve this problem, there are two MSRs reserved for the GSBASE register: one active and one hidden. In kernel mode, the kernel's GSBASE is held in the usual GSBASE MSR and the user-mode base is in the other (hidden) GSBASE MSR. When context switching from kernel mode to a user-mode context, and when saving a user-mode context and entering kernel mode, the context switch code must execute the SWAPGS instruction, which swaps the values of the visible and hidden GSBASE MSRs. Since the kernel's GSBASE is safely hidden in the other MSR while in user mode, user-mode code can't clobber the kernel's GSBASE by loading a value into GS. When the CPU reenters kernel mode, the context save code will execute SWAPGS and restore the kernel's GSBASE.
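To make the GSBASE discussion a bit more concrete, here is a small C model of it. This is a sketch under assumptions, not kernel code: the struct and function names are made up for illustration, the MSR macro names are just labels for the architectural MSR numbers (0xC0000102 is the hidden base that SWAPGS exchanges with 0xC0000101), and checking the privilege bits of the saved CS is one common way for an entry stub to decide whether SWAPGS is needed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Architectural MSR numbers; the macro names are illustrative labels. */
#define MSR_FS_BASE        0xC0000100u  /* base for FS-relative accesses      */
#define MSR_GS_BASE        0xC0000101u  /* active (visible) GS base           */
#define MSR_KERNEL_GS_BASE 0xC0000102u  /* hidden GS base, swapped by SWAPGS  */

/* Hypothetical model of the two GS base values described in the text. */
struct gs_bases {
    uint64_t active;  /* what GS-relative accesses currently use */
    uint64_t hidden;  /* the other base, unreachable from user mode */
};

/* Models the SWAPGS instruction: exchange the active and hidden bases. */
static inline void model_swapgs(struct gs_bases *g)
{
    uint64_t tmp = g->active;
    g->active = g->hidden;
    g->hidden = tmp;
}

/* Models user code loading GS: only the active base can be overwritten. */
static inline void model_user_load_gs(struct gs_bases *g, uint64_t new_base)
{
    g->active = new_base;
}

/*
 * The low two bits of the CS selector saved on the interrupt stack give the
 * privilege level of the interrupted code. Non-zero means the interrupt came
 * from user mode, so the entry stub has to execute SWAPGS.
 */
static inline bool needs_swapgs(uint64_t saved_cs)
{
    return (saved_cs & 3u) != 0;
}
```

In this model, a user-mode write to GS only ever touches the active field, so swapping on kernel entry always brings back the kernel's untouched base.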
Learn from existing code that does this kind of thing. For example:
SAVE_ARGS_IRQ: entry_64.S
INTR_PUSH: privregs.h
IDT_VEC: exception.S (vector.S in NetBSD is similar)
In fact, "manually pushing" the registers is the only way on AMD64, since PUSHA doesn't exist there. AMD64 isn't unique in this respect: most non-x86 CPUs also require register-by-register saves and restores at some point.
But if you inspect the referenced source code closely, you'll find that not all interrupt handlers need to save and restore the entire register set, so there is room for optimization.
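As a rough illustration of that optimization, the sketch below describes the stack layout a stub would produce if it saved only the registers a System V AMD64 C function is allowed to clobber before calling a C handler. The struct names and the push order are assumptions made for this example; they are not taken from any of the kernels mentioned above. The hardware part is the five-quadword frame the CPU pushes on interrupt entry in 64-bit mode for vectors without an error code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Caller-saved ("scratch") registers under the System V AMD64 ABI.
 * Assumed push order: rax first, r11 last, so r11 ends up at the
 * lowest address and is therefore the first member.
 */
struct scratch_regs {
    uint64_t r11, r10, r9, r8;
    uint64_t rdi, rsi, rdx, rcx, rax;
};

/* Pushed by the CPU itself on interrupt entry in 64-bit mode. */
struct hw_frame {
    uint64_t rip, cs, rflags, rsp, ss;
};

/* What the C handler would see if the stub passed it the stack pointer. */
struct irq_frame {
    struct scratch_regs regs;
    struct hw_frame     hw;
};

/* Nine scratch registers instead of fifteen general-purpose ones. */
_Static_assert(sizeof(struct scratch_regs) == 9 * 8, "unexpected padding");

/*
 * The CPU aligns rsp to 16 bytes before pushing hw_frame, so a total frame
 * size that is a multiple of 16 keeps the stack aligned for the call.
 */
_Static_assert(sizeof(struct irq_frame) % 16 == 0, "frame breaks alignment");
```

A handler taking a struct irq_frame pointer can then read the interrupted rip or rax directly, while rbx, rbp and r12-r15 are left to the compiler-generated prologue and epilogue.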