Porting inline assembly from 32-bit to 64-bit

Posted 2024-08-26 09:15:22

I have a piece of C++ code (compiled with g++ under GNU/Linux) that loads a function pointer (how it does that doesn't matter), pushes some arguments onto the stack with inline assembly, and then calls that function. The code looks like this:

unsigned long stack[] = { 1, 23, 33, 43 };

/* save all the registers and the stack pointer */
unsigned long esp;
asm __volatile__ ( "pusha" );
asm __volatile__ ( "mov %%esp, %0" :"=m" (esp));

for( i = 0; i < sizeof(stack) / sizeof(stack[0]); i++ ){
    unsigned long val = stack[i];
    asm __volatile__ ( "push %0" :: "m"(val) );
}

unsigned long ret = function_pointer();

/* restore registers and stack pointer */
asm __volatile__ ( "mov %0, %%esp" :: "m" (esp) );
asm __volatile__ ( "popa" );

I'd like to add some sort of

#ifdef _LP64
   // 64bit inline assembly
#else
   // 32bit version as above example
#endif

But I don't know inline assembly for 64-bit machines. Could anyone help me?

Thanks

颜 2024-09-02 09:15:22

While it shouldn't be much of a problem to call a function pointer with the appropriate arguments in inline assembly, I don't think naively recoding this for x64 will help you, because the calling conventions are very probably different (the defaults for 32-bit and 64-bit Linux definitely are). Have a look here for details. So I guess that if you can get away without inline assembly in this case (see the other answer), it'll be easier to port.
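If every argument is integer-sized, one way to avoid inline assembly is to cast the pointer to the matching function type and let the compiler apply the platform's calling convention. The following is only a sketch under that assumption: `call_with_args` and `generic_fn` are hypothetical names that do not appear in the question, and the limit of four arguments is arbitrary.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

using ul = unsigned long;
using generic_fn = void (*)();  // hypothetical "untyped" function pointer

// Calls fn with `count` unsigned long arguments taken from `args`.
// Assumption: the target really takes exactly `count` integer-sized arguments.
ul call_with_args(generic_fn fn, const ul *args, std::size_t count) {
    assert(fn != nullptr);
    switch (count) {
    case 0: return reinterpret_cast<ul (*)()>(fn)();
    case 1: return reinterpret_cast<ul (*)(ul)>(fn)(args[0]);
    case 2: return reinterpret_cast<ul (*)(ul, ul)>(fn)(args[0], args[1]);
    case 3: return reinterpret_cast<ul (*)(ul, ul, ul)>(fn)(args[0], args[1], args[2]);
    case 4: return reinterpret_cast<ul (*)(ul, ul, ul, ul)>(fn)(args[0], args[1], args[2], args[3]);
    default: throw std::invalid_argument("unsupported argument count");
    }
}
```

The same source then compiles for both 32-bit and 64-bit targets, because the compiler decides which arguments go in registers and which go on the stack.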

Edit: OK, I see you may have to use assembly. Here are some pointers.

According to Agner Fog's document, Linux x64 uses RDI, RSI, RDX, RCX, R8, R9 and XMM0-XMM7 for parameter passing. This implies that in order to achieve what you want (disregarding floating-point use) your function will have to:

(1) Save all registers that need to be saved (RBX, RBP, R12-R15): set aside space on the stack and move these registers there. This will be something along the lines of (Intel syntax):

sub rsp, 0xSomeNumber1
mov [rsp+i*8], r# ; insert appropriate i for each register r# to be moved

(2) Evaluate the number of arguments you will have to pass on the stack to the target function. Use this to set aside the required space on the stack (sub rsp, 0xSomeNumber2), taking 0xSomeNumber1 into account so that the stack is 16-byte aligned when the call is made, i.e. rsp must be a multiple of 16 at that point. Don't modify rsp after this until your called function has returned.

(3) Load your function arguments on the stack (if necessary) and in the registers used for parameter transfer. In my view, it's easiest if you start with the stack parameters and load register parameters last.

;loop over stack parameters - something like this
mov rax, qword ptr [AddrOfFirstStackParam + 8*NumberOfStackParam]
mov [rsp + OffsetToFirstStackParam + 8*NumberOfStackParam], rax

Depending on how you set up your routine, the offset to the first stack parameter etc. may be unnecessary. Then set up the register-passed arguments (skipping those you don't need):

mov r9, Param6
mov r8, Param5
mov rcx, Param4
mov rdx, Param3
mov rsi, Param2
mov rdi, Param1

(4) Call the target function through a register that is not used for parameter passing:

call r# ; assuming register r# contains the address of the target function

(5) Restore the saved registers and restore rsp to the value it had on entry to your function. If necessary, copy the called function's return value (returned in rax) to wherever you want it. That's all.
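The arithmetic behind step (2) can be sketched as below. The helper names are hypothetical, and the sketch assumes that all arguments are 8-byte integers and that the first six travel in registers.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kRegisterArgs = 6;  // rdi, rsi, rdx, rcx, r8, r9

// Number of arguments that no longer fit in registers.
constexpr std::size_t stack_arg_count(std::size_t total_args) {
    return total_args > kRegisterArgs ? total_args - kRegisterArgs : 0;
}

// Rounds a byte count up to the next multiple of 16.
constexpr std::size_t round_up_16(std::size_t bytes) {
    return (bytes + 15) & ~static_cast<std::size_t>(15);
}

// Bytes to reserve for stack arguments, padded to a multiple of 16.
constexpr std::size_t stack_arg_bytes(std::size_t total_args) {
    return round_up_16(stack_arg_count(total_args) * 8);
}

// Runtime checks for a few representative argument counts.
inline void self_check() {
    assert(stack_arg_bytes(4) == 0);   // everything fits in registers
    assert(stack_arg_bytes(7) == 16);  // one stack argument, padded from 8 to 16
    assert(stack_arg_bytes(9) == 32);  // three stack arguments, padded from 24 to 32
}
```

Padding the reservation to a multiple of 16 keeps rsp aligned only if it was already aligned before the subtraction, so the space used for saved registers in step (1) has to be counted as well.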

Note: the above sketch does not take account of floating point values to be passed in XMM registers, but the same principles apply.
Disclaimer: I have done something similar on Win64, but never on Linux, so there may be some details I am overlooking. Read well, write your code carefully and test well.
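Put together, the steps might look roughly like this with GCC extended asm (AT&T syntax) on x86-64 Linux. This is a sketch under stated assumptions, not a tested drop-in: `CallFrame` and `call_sysv` are hypothetical names, only integer arguments are handled, and callee-saved registers are left to the compiler through the clobber list instead of being saved by hand.

```cpp
#include <cassert>
#include <cstddef>

#if !defined(__x86_64__) || !defined(_LP64)
#error "This sketch targets x86-64 System V (Linux, LP64) only."
#endif

// Hypothetical descriptor read by the assembly; every field is 8 bytes wide.
struct CallFrame {
    void (*fn)();                     // offset  0: target function
    unsigned long reg[6];             // offset  8: rdi, rsi, rdx, rcx, r8, r9
    const unsigned long *stack_args;  // offset 56: arguments 7 and up
    unsigned long nstack;             // offset 64: number of stack arguments
};
static_assert(sizeof(CallFrame) == 72, "assembly offsets assume this layout");

unsigned long call_sysv(void (*fn)(), const unsigned long *args, std::size_t count) {
    assert(fn != nullptr);
    CallFrame frame = {};
    frame.fn = fn;
    const std::size_t nreg = count < 6 ? count : 6;
    for (std::size_t i = 0; i < nreg; ++i) frame.reg[i] = args[i];
    frame.stack_args = args + nreg;
    frame.nstack = count - nreg;

    unsigned long ret;
    asm volatile(
        "mov  %%rsp, %%r12\n\t"           // remember rsp in a callee-saved register
        "sub  $128, %%rsp\n\t"            // step over the red zone
        "mov  64(%%rbx), %%rcx\n\t"       // rcx = number of stack arguments
        "lea  (,%%rcx,8), %%rax\n\t"
        "sub  %%rax, %%rsp\n\t"           // reserve space for them
        "and  $-16, %%rsp\n\t"            // 16-byte alignment at the call
        "mov  56(%%rbx), %%rsi\n\t"       // rsi = source of stack arguments
        "xor  %%eax, %%eax\n"             // rax = loop index
        "1:\n\t"
        "cmp  %%rcx, %%rax\n\t"
        "jae  2f\n\t"
        "mov  (%%rsi,%%rax,8), %%rdx\n\t"
        "mov  %%rdx, (%%rsp,%%rax,8)\n\t" // argument 7 ends up at the lowest address
        "inc  %%rax\n\t"
        "jmp  1b\n"
        "2:\n\t"
        "mov   8(%%rbx), %%rdi\n\t"       // register arguments 1 to 6
        "mov  16(%%rbx), %%rsi\n\t"
        "mov  24(%%rbx), %%rdx\n\t"
        "mov  32(%%rbx), %%rcx\n\t"
        "mov  40(%%rbx), %%r8\n\t"
        "mov  48(%%rbx), %%r9\n\t"
        "xor  %%eax, %%eax\n\t"           // al = 0: no vector registers used
        "call *(%%rbx)\n\t"               // call through frame.fn
        "mov  %%r12, %%rsp\n\t"           // restore the original rsp
        : "=a"(ret)
        : "b"(&frame)
        : "rcx", "rdx", "rsi", "rdi", "r8", "r9", "r10", "r11", "r12",
          "xmm0", "xmm1", "xmm2", "xmm3", "xmm4", "xmm5", "xmm6", "xmm7",
          "xmm8", "xmm9", "xmm10", "xmm11", "xmm12", "xmm13", "xmm14", "xmm15",
          "memory", "cc");
    return ret;
}
```

Passing a single pointer in rbx, which the callee must preserve, means the compiler never has to address an operand relative to rsp while rsp is being modified. With the array from the question, `call_sysv(function_pointer, stack, 4)` would place all four values in registers, so nothing is copied to the stack.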
