Porting inline assembly from 32-bit to 64-bit

Posted on 2024-08-26 09:15:22

I have a piece of C++ code (compiled with g++ under GNU/Linux) that loads a function pointer (how it does that doesn't matter), pushes some arguments onto the stack with inline assembly, and then calls that function. The code looks like this:

unsigned long stack[] = { 1, 23, 33, 43 };

/* save all the registers and the stack pointer */
unsigned long esp;
asm __volatile__ ( "pusha" );
asm __volatile__ ( "mov %%esp, %0" :"=m" (esp));

/* iterate over the elements of stack[], not its size in bytes */
for( unsigned i = 0; i < sizeof(stack) / sizeof(stack[0]); i++ ){
    unsigned long val = stack[i];
    asm __volatile__ ( "push %0" :: "m"(val) );
}

unsigned long ret = function_pointer();

/* restore registers and stack pointer */
asm __volatile__ ( "mov %0, %%esp" :: "m" (esp) );
asm __volatile__ ( "popa" );

I'd like to add some sort of

#ifdef _LP64
   // 64bit inline assembly
#else
   // 32bit version as above example
#endif

But I don't know inline assembly for 64-bit machines. Could anyone help me?

Thanks

Comments (3)

2024-09-02 09:15:22

While it shouldn't be much of a problem to call a function pointer with the appropriate arguments in inline assembly, I don't think recoding this naively for x64 will help you, because the calling conventions are very probably different (the defaults for 32-bit and 64-bit Linux are definitely different). Have a look here for details. So if you can get away without inline assembly in this case (see the other answer), it will be easier to port.

Edit: OK, I see you may have to use assembly. Here are some pointers.

According to Agner Fog's document, Linux x64 uses RDI, RSI, RDX, RCX, R8 and R9 (integer and pointer arguments, in that order) and XMM0-XMM7 (floating-point arguments) for parameter transfer. This implies that in order to achieve what you want (disregarding floating-point use) your function will have to:

(1) Save all registers that need to be saved (RBX, RBP, R12-R15): set aside space on the stack and move these registers there. This will be something along the lines of (Intel syntax):

sub rsp, 0xSomeNumber1
mov [rsp+i*8], r# ; insert appropriate i for each register r# to be moved

(2) Work out how many arguments you will have to pass on the stack to the target function. Use this to set aside the required space on the stack (sub rsp, 0xSomeNumber2), taking 0xSomeNumber1 into account so that rsp is a multiple of 16 when the call instruction executes. Don't modify rsp after this until the called function has returned.

(3) Load your function arguments on the stack (if necessary) and in the registers used for parameter transfer. In my view, it's easiest if you start with the stack parameters and load register parameters last.

;loop over stack parameters - something like this
mov rax, qword ptr [AddrOfFirstStackParam + 8*NumberOfStackParam]
mov [rsp + OffsetToFirstStackParam + 8*NumberOfStackParam], rax

Depending on how you set up your routine, the offset to the first stack parameter etc. may be unnecessary. Then load the register-passed arguments (skipping those you don't need):

mov r9, Param6
mov r8, Param5
mov rcx, Param4
mov rdx, Param3
mov rsi, Param2
mov rdi, Param1

(4) Call the target function using a different register from the above:

call r# ; assuming register r# contains the address of the target function (use call qword ptr [r#] only if r# points to a memory slot that holds the address)

(5) Restore the saved registers and restore rsp to the value it had on entry to your function. If necessary, copy the called function's return value (in RAX for integer results) to wherever you want it. That's all.
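
The alignment arithmetic in step (2) can be sketched as a small helper. This is only an illustration: `frame_bytes` is a hypothetical name, and it assumes a routine that is entered through a normal `call` (so rsp is 8 bytes short of a 16-byte boundary on entry) and that reserves all of its space with a single `sub rsp, N`.

```cpp
#include <cstddef>

// Hypothetical helper: number of bytes to subtract from rsp right after entry
// so that there is room for `saved_regs` saved registers and `stack_args`
// stack-passed arguments (8 bytes each) and rsp ends up a multiple of 16.
constexpr std::size_t frame_bytes(std::size_t saved_regs, std::size_t stack_args)
{
    const std::size_t needed = 8 * (saved_regs + stack_args);
    // Assumption: on entry rsp % 16 == 8 because the call pushed a return
    // address. The reservation must therefore also be 8 modulo 16.
    return (needed % 16 == 8) ? needed : needed + 8;
}

// Six saved registers and no stack arguments: 48 bytes plus 8 bytes of padding.
static_assert(frame_bytes(6, 0) == 56, "padding expected");
// Six saved registers and one stack argument: 56 bytes, no padding needed.
static_assert(frame_bytes(6, 1) == 56, "no padding expected");
```

In the terms used above, the result corresponds to 0xSomeNumber1 + 0xSomeNumber2 combined.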

Note: the above sketch does not take account of floating point values to be passed in XMM registers, but the same principles apply.
Disclaimer: I have done something similar on Win64, but never on Linux, so there may be some details I am overlooking. Read well, write your code carefully and test well.
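
For the common case where everything fits in registers, the steps above might look roughly like this in GCC extended asm (AT&T syntax). Treat it as a sketch under stated assumptions rather than a finished port: `raw_fn` and `call_with_regs` are hypothetical names, only up to six integer arguments are handled (no stack-passed or floating-point arguments), and the target is assumed to follow the System V x86-64 convention.

```cpp
#include <cassert>
#include <cstddef>

#if !defined(__x86_64__) || !defined(__GNUC__)
#error "This sketch assumes GCC/Clang extended asm on x86-64 (System V ABI)."
#endif

// Hypothetical type for "a function whose real signature is not known here".
using raw_fn = void (*)();

// Hypothetical helper: call fn with args[0..n-1] as integer arguments.
// Only register-passed arguments (n <= 6) are supported in this sketch.
unsigned long call_with_regs(raw_fn fn, const unsigned long *args, std::size_t n)
{
    assert(n <= 6);
    unsigned long a[6] = {0, 0, 0, 0, 0, 0};
    for (std::size_t i = 0; i < n; ++i)
        a[i] = args[i];

    unsigned long a0 = a[0], a1 = a[1], a2 = a[2], a3 = a[3];
    // GNU explicit register variables pin the last two arguments to r8/r9,
    // for which there are no single-letter constraints.
    register unsigned long a4 asm("r8") = a[4];
    register unsigned long a5 asm("r9") = a[5];
    unsigned long ret;

    asm volatile(
        "mov  %%rsp, %%rbx\n\t"  // rbx is callee-saved, so it survives the call
        "sub  $128, %%rsp\n\t"   // step over the red zone the compiler may use
        "and  $-16, %%rsp\n\t"   // rsp must be a multiple of 16 at the call
        "call *%[fn]\n\t"        // indirect call through a register
        "mov  %%rbx, %%rsp\n\t"  // restore the original stack pointer
        // The argument registers are caller-saved, so they are declared as
        // in/out operands: the callee is free to overwrite them.
        : "=a"(ret), "+D"(a0), "+S"(a1), "+d"(a2), "+c"(a3), "+r"(a4), "+r"(a5)
        : [fn] "r"(fn)
        : "rbx", "r10", "r11", "memory", "cc",
          "xmm0", "xmm1", "xmm2", "xmm3", "xmm4", "xmm5", "xmm6", "xmm7",
          "xmm8", "xmm9", "xmm10", "xmm11", "xmm12", "xmm13", "xmm14", "xmm15");
    return ret;
}
```

With the values from the question, `call_with_regs(reinterpret_cast<raw_fn>(f), stack, 4)` would pass 1, 23, 33 and 43 in rdi, rsi, rdx and rcx. Keeping the whole sequence in one asm statement and listing the clobbers lets the compiler do the register saving from step (1) itself.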

世界如花海般美丽 2024-09-02 09:15:22

Not really answering your question, but I think you might be able to achieve this in a platform independent way by use of setcontext (or makecontext).
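
A minimal sketch of that idea, assuming a glibc-style `<ucontext.h>` is available: `makecontext` accepts the target function plus a count of `int` arguments, so no assembly is needed. The names `target`, `run_on_fresh_stack`, `g_caller`, `g_callee` and `g_result` are made up for the example, and the arguments are limited to `int` because that is what `makecontext` documents.

```cpp
#include <ucontext.h>

static ucontext_t g_caller;   // context to return to
static ucontext_t g_callee;   // context that runs the target
static long g_result = 0;     // makecontext targets return void, so the
                              // result is handed back through a global

// Hypothetical target function taking four int arguments.
static void target(int a, int b, int c, int d)
{
    g_result = static_cast<long>(a) + b + c + d;
}

// Run target(a, b, c, d) on its own stack and return what it computed.
long run_on_fresh_stack(int a, int b, int c, int d)
{
    static char stack[64 * 1024];           // stack for the new context

    getcontext(&g_callee);                  // initialise the structure
    g_callee.uc_stack.ss_sp = stack;
    g_callee.uc_stack.ss_size = sizeof stack;
    g_callee.uc_link = &g_caller;           // resume here when target returns

    makecontext(&g_callee, reinterpret_cast<void (*)()>(target), 4, a, b, c, d);
    swapcontext(&g_caller, &g_callee);      // switch to target, come back after
    return g_result;
}
```

Calling `run_on_fresh_stack(1, 23, 33, 43)` runs the target with the four values from the question. The trade-off is that the argument count is still fixed at the `makecontext` call site and wider argument types are not covered.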

酒与心事 2024-09-02 09:15:22

Main problems:

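  • pushad/popad are not available on x64; you have to push the individual registers you wish to save
  • you need to save rsp (the 64-bit stack pointer) in a 64-bit location that survives the call: a callee-saved register (rbx, r12-r15) or a memory variable. rax and r8 are caller-saved, so the called function may overwrite them
  • the calling convention has almost certainly changed from 32-bit to 64-bit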

Summary of changes from x86 to x64:

  • Registers starting with E now have 64 bit equivalents starting with R. RAX, RBX, RCX, RDX, RDI, RSI, RIP, RSP, RBP.
  • New registers: R8 ... R15
  • No pushad, No popad

I've ported some inline x86 code to x64 on Windows. You should definitely take some time to read up on the x64 instruction set and on the calling convention for your operating system. The change on Windows was radical and the new calling convention is a lot simpler. I suspect the convention on GNU/Linux changed as well, and I would definitely not assume it is the same as on Windows.

I agree with a previous answer: if you can use an alternative to hand-written assembly, do so. In my case, I couldn't avoid it.
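
As a small illustration of the register renaming, here is how the "save the stack pointer" line from the question could be guarded per architecture. It is a sketch with a hypothetical function name, `read_stack_pointer`, and it tests `__x86_64__` / `__i386__` rather than `_LP64`, since `_LP64` describes the data model and is also defined on 64-bit targets that are not x86.

```cpp
#if !defined(__GNUC__)
#error "This sketch assumes GCC/Clang extended asm."
#endif

// Hypothetical helper: return the current value of the stack pointer.
unsigned long read_stack_pointer()
{
    unsigned long sp;
#if defined(__x86_64__)
    asm volatile("mov %%rsp, %0" : "=r"(sp));   // 64-bit: rsp
#elif defined(__i386__)
    asm volatile("mov %%esp, %0" : "=r"(sp));   // 32-bit: esp
#else
#error "This sketch only covers x86 and x86-64."
#endif
    return sp;
}
```

The same pattern extends to the rest of the code, but note that on x86-64 any `push` or `call` issued from inline asm also has to respect the red zone and the 16-byte alignment rule, which the 32-bit version never had to consider.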
