Absolute addressing for run-time code replacement in x86_64
I'm currently using a code replacement scheme on 32-bit x86 in which code that has been moved to another location reads variables and a class pointer. Since x86_64 does not support absolute addressing, I have trouble getting the correct addresses of the variables at the code's new location. In detail, the problem is that with RIP-relative addressing the instruction pointer at run time differs from the one assumed at compile time.
So is there a way to use absolute addressing in x86_64, or some other way to get the addresses of variables that is not relative to the instruction pointer?
Something like: leaq variable(%%rax), %%rbx
would also help. I just want to avoid any dependency on the instruction pointer.
Try using the large code model for x86_64. In gcc this can be selected with -mcmodel=large. The compiler will then use 64-bit absolute addressing for both code and data.
You could also add -fno-pic to disallow the generation of position-independent code.
Edit: I built a small test app with -mcmodel=large, and the resulting binary contains sequences that load an absolute 64-bit immediate (in this case an address) and follow it with an indirect call or an indirect load.
Loading the 64-bit offset into %rbx as an immediate and then adding %rax to it is the equivalent of a "leaq offset64bit(%rax), %rbx" (which doesn't exist), with some side effects such as changed flags.
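As a rough illustration of that two-instruction replacement for the nonexistent leaq form, the sketch below emits the machine code for "movabs $addr, %rbx" followed by "add %rax, %rbx" into a buffer. The function name emit_abs_lea_rbx is made up for this example, and the register choice simply mirrors the question; it is not taken from the compiler output mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: writes "movabs $addr, %rbx ; add %rax, %rbx"
 * into out, which must have room for 13 bytes.
 * Returns the number of bytes written. */
size_t emit_abs_lea_rbx(uint8_t *out, uint64_t addr)
{
    assert(out != NULL);
    size_t n = 0;

    out[n++] = 0x48;                 /* REX.W: 64-bit operand size      */
    out[n++] = 0xBB;                 /* B8+rd, rd = 3 (rbx): mov imm64  */
    for (int i = 0; i < 8; i++)      /* imm64, little-endian            */
        out[n++] = (uint8_t)(addr >> (8 * i));

    out[n++] = 0x48;                 /* REX.W                           */
    out[n++] = 0x01;                 /* add r/m64, r64                  */
    out[n++] = 0xC3;                 /* ModRM: mod=11, reg=rax, rm=rbx  */
    return n;
}
```

Unlike a real lea, the add modifies the flags, which is the side effect mentioned above.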
What you're asking about is doable, but not very easy.
One way to do it is to compensate for the code move in the instructions themselves. You need to find all the instructions that use RIP-relative addressing (they have a ModRM byte of 05h, 0dh, 15h, 1dh, 25h, 2dh, 35h or 3dh) and adjust their disp32 field by the amount of the move (the move is therefore limited to +/- 2GB in the virtual address space, which may not be guaranteed given that the 64-bit address space is bigger than 4GB).
You can also replace those instructions with their equivalents, most likely replacing every original instruction with more than one.
Both methods will require at least some rudimentary x86 disassembly routines.
The former may require the use of VirtualAlloc() on Windows (or some equivalent on Linux) to ensure the memory that contains the patched copy of the original code is within +/- 2GB of that original code. And allocation at specific addresses can still fail.
The latter will require more than just primitive disassembling: it also needs full instruction decoding and generation.
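For the first approach on Linux, one possible equivalent is mmap() with a hint address. The sketch below is an assumption about how such a helper might look, not a tested recipe: alloc_near and within_disp32 are invented names, the 16 MiB probing step is arbitrary, and the kernel is free to ignore the hint, which is why the result is checked and NULL is a normal outcome.

```c
#define _DEFAULT_SOURCE              /* MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Rough check that two addresses are less than 2GB apart. */
bool within_disp32(const void *a, const void *b)
{
    uint64_t x = (uint64_t)(uintptr_t)a;
    uint64_t y = (uint64_t)(uintptr_t)b;
    uint64_t dist = x > y ? x - y : y - x;
    return dist < ((uint64_t)1 << 31);
}

/* Hypothetical helper: tries to map len bytes of read/write memory close
 * to origin by passing hint addresses to mmap(). Returns NULL if no
 * mapping within reach was obtained. The caller would still have to copy
 * the code in and make the region executable with mprotect(). */
void *alloc_near(const void *origin, size_t len)
{
    assert(len > 0);
    const uintptr_t step = (uintptr_t)1 << 24;          /* 16 MiB */
    uintptr_t base = (uintptr_t)origin & ~(step - 1);

    for (uintptr_t off = step; off < ((uintptr_t)1 << 30); off += step) {
        void *p = mmap((void *)(base + off), len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED)
            continue;
        if (within_disp32(p, origin) &&
            within_disp32((const char *)p + len, origin))
            return p;
        munmap(p, len);              /* hint was ignored: too far away */
    }
    return NULL;
}
```

The hint is deliberately used without MAP_FIXED, since MAP_FIXED would silently replace whatever is already mapped at that address.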
There may be other quirks to work around.
Instruction boundaries may also be found by setting the TF flag in the RFLAGS register to make the CPU generate the single-step debug interrupt at the end of execution of every instruction. A debug exception handler will need to catch those and record the value of RIP of the next instruction. I believe this can be done using Structured Exception Handling (SEH) in Windows (never tried with the debug interrupts), not sure about Linux. For this to work you'll have to make all of the code execute, every instruction.
Btw, there's absolute addressing in 64-bit mode; see, for example, the MOV to/from accumulator instructions with opcodes from 0A0h through 0A3h.
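To make the accumulator forms concrete, here is a small sketch that encodes one of the four opcodes followed by a full 64-bit absolute address. The helper name emit_mov_accum_abs is invented for this example, and adding REX.W to the 0A1h and 0A3h forms (so that RAX is used rather than EAX) is a choice made here, not something the answer prescribes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encodes a MOV between the accumulator and an
 * absolute 64-bit address (moffs64).
 *   0xA0: load AL from [addr]      0xA2: store AL to [addr]
 *   0xA1: load RAX from [addr]     0xA3: store RAX to [addr]
 * out needs room for 10 bytes. Returns the number of bytes written. */
size_t emit_mov_accum_abs(uint8_t *out, uint8_t opcode, uint64_t addr)
{
    assert(out != NULL);
    assert(opcode >= 0xA0 && opcode <= 0xA3);
    size_t n = 0;

    if (opcode & 1)
        out[n++] = 0x48;             /* REX.W: use RAX, not EAX   */
    out[n++] = opcode;
    for (int i = 0; i < 8; i++)      /* moffs64, little-endian    */
        out[n++] = (uint8_t)(addr >> (8 * i));
    return n;
}
```

These forms only work with the accumulator, so they carry no dependency on RIP but also cannot target other registers directly.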