Absolute addressing for runtime code replacement in x86_64

Posted on 2024-12-13 12:22:15

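I'm currently using a code replacement scheme in 32-bit, where code that is moved to another location reads variables and the class pointer. Since x86_64 does not support absolute addressing, I have trouble getting the correct addresses of the variables at the code's new location. In detail, the problem is that with RIP-relative addressing the instruction pointer differs from the one assumed at compile time.

So is there a way to use absolute addressing in x86_64, or some other way to get the addresses of variables that is not relative to the instruction pointer?

Something like: leaq variable(%%rax), %%rbx would also help. I just want no dependency on the instruction pointer.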

远山浅 2024-12-20 12:22:15

Try the large code model for x86_64. In gcc it is selected with -mcmodel=large. The compiler will then use 64-bit absolute addressing for both code and data.

You could also add -fno-pic to disallow the generation of position-independent code.

Edit: I built a small test app with -mcmodel=large, and the resulting binary contains sequences like

400b81:       48 b9 f0 30 60 00 00    movabs $0x6030f0,%rcx
400b88:       00 00 00 
400b8b:       49 b9 d0 09 40 00 00    movabs $0x4009d0,%r9
400b92:       00 00 00 
400b95:       48 8b 39                mov    (%rcx),%rdi
400b98:       41 ff d1                callq  *%r9

This is a load of an absolute 64-bit immediate (in this case an address) into a register, followed by an indirect load or an indirect call through that register. The instruction sequence

movabs $variable, %rbx
addq %rax, %rbx

is equivalent to "leaq offset64bit(%rax), %rbx" (which doesn't exist, because displacements in memory operands are limited to 32 bits), apart from side effects such as changing the flags.
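
If the replacement scheme generates or patches such instructions itself, the encoding is simple enough to emit by hand. The following is only a sketch; emit_movabs is a made-up helper name, and the register numbers are the usual hardware encodings (0 = rax, 1 = rcx, 3 = rbx, 9 = r9).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: writes `movabs $imm64, %reg` into buf and returns the
 * instruction length (always 10 bytes). reg is the hardware register number. */
size_t emit_movabs(uint8_t *buf, unsigned reg, uint64_t imm64)
{
    buf[0] = (uint8_t)(0x48 | ((reg >> 3) & 1)); /* REX.W, plus REX.B for r8..r15 */
    buf[1] = (uint8_t)(0xB8 | (reg & 7));        /* MOV r64, imm64: opcode B8 + register */
    for (int i = 0; i < 8; i++)                  /* 64-bit immediate, little-endian */
        buf[2 + i] = (uint8_t)(imm64 >> (8 * i));
    return 10;
}
```

Calling emit_movabs(buf, 1, 0x6030f0) produces the same bytes as the first movabs in the listing above (48 b9 f0 30 60 00 00 00 00 00).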

倒带 2024-12-20 12:22:15

What you're asking about is doable, but not very easy.

One way is to compensate for the move inside the instructions themselves. You need to find all the instructions that use RIP-relative addressing (their ModRM byte is 05h, 0dh, 15h, 1dh, 25h, 2dh, 35h or 3dh) and adjust their disp32 field by the distance moved. The move is therefore limited to +/- 2GB in the virtual address space, which may not be guaranteed given that the 64-bit address space is larger than 4GB.
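
A rough sketch of that adjustment in C, assuming a disassembler has already located the ModRM byte and the disp32 field of each instruction. The function names are made up for illustration, and the memcpy-based access assumes a little-endian host, which holds on x86_64. Since the target stays put while the code moves by delta, the new displacement is the old one minus delta.

```c
#include <stdint.h>
#include <string.h>

/* True if a ModRM byte selects RIP-relative addressing in 64-bit mode:
 * mod == 00 and r/m == 101, i.e. 05h, 0dh, 15h, ..., 3dh. */
int modrm_is_rip_relative(uint8_t modrm)
{
    return (modrm & 0xC7) == 0x05;
}

/* disp_field points at the 4-byte disp32 inside the moved copy;
 * delta = new code address - old code address.
 * Returns 0 on success, -1 if the target is no longer reachable. */
int rebase_disp32(uint8_t *disp_field, int64_t delta)
{
    int32_t old_disp;
    memcpy(&old_disp, disp_field, sizeof old_disp);

    int64_t new_disp = (int64_t)old_disp - delta; /* same target, new RIP */
    if (new_disp < INT32_MIN || new_disp > INT32_MAX)
        return -1;                                /* moved more than +/- 2GB away */

    int32_t d = (int32_t)new_disp;
    memcpy(disp_field, &d, sizeof d);
    return 0;
}
```

When rebase_disp32 fails, the field is left untouched, so the caller can fall back to replacing the instruction instead.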

You can also replace those instructions with their equivalents, most likely replacing every original instruction with more than one, for example:

; These replace the original instruction and occupy exactly as many bytes as the original instruction:
  JMP Equivalent1
  NOP
  NOP
Equivalent1End:

; This is the code equivalent to the original instruction:
Equivalent1:
  Equivalent subinstruction 1
  Equivalent subinstruction 2
  ...
  JMP Equivalent1End
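
The patch at the original site (the JMP plus NOP padding) could be written along these lines. This is only a sketch: emit_jmp_patch is a hypothetical name, and it assumes the 5-byte `jmp rel32` form (opcode E9), so the original instruction must be at least 5 bytes long and the equivalent code must sit within +/- 2GB of it.

```c
#include <stddef.h>
#include <stdint.h>

/* Overwrites an instruction of insn_len bytes that lives at site_addr with
 * `jmp rel32` to target_addr, padding the rest with NOPs. buf is a writable
 * view of those bytes. Returns 0 on success, -1 if the instruction is too
 * short or the target is out of rel32 range. */
int emit_jmp_patch(uint8_t *buf, size_t insn_len,
                   uint64_t site_addr, uint64_t target_addr)
{
    if (insn_len < 5)
        return -1;

    /* rel32 is relative to the end of the 5-byte jmp. */
    int64_t rel = (int64_t)(target_addr - (site_addr + 5));
    if (rel < INT32_MIN || rel > INT32_MAX)
        return -1;

    buf[0] = 0xE9;                               /* JMP rel32 */
    for (int i = 0; i < 4; i++)                  /* displacement, little-endian */
        buf[1 + i] = (uint8_t)((uint32_t)rel >> (8 * i));
    for (size_t i = 5; i < insn_len; i++)        /* pad to the original length */
        buf[i] = 0x90;                           /* NOP */
    return 0;
}
```

With a 7-byte original instruction this yields the JMP followed by two NOPs shown in the pseudo-assembly; the jump back to Equivalent1End can be generated the same way with insn_len set to 5.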

Both methods will require at least some rudimentary x86 disassembly routines.

The former may require VirtualAlloc() on Windows (or an equivalent such as mmap() on Linux) to ensure that the memory holding the patched copy of the original code is within +/- 2GB of that original code. Allocation at a specific address can still fail.
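
On Linux, one possible approach (a sketch, not a guaranteed recipe) is to pass mmap() an address hint close to the original code and check where the kernel actually placed the mapping. alloc_near and within_2gb are made-up names, the 16 MB probing step is an arbitrary choice, and the range check is only approximate because the exact limit depends on each instruction's position.

```c
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS on glibc */
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Nonzero if the whole block [p, p+len) lies within roughly +/- 2GB of origin. */
int within_2gb(const void *p, size_t len, const void *origin)
{
    int64_t lo = (int64_t)((uintptr_t)p - (uintptr_t)origin);
    int64_t hi = lo + (int64_t)len;
    return lo >= INT32_MIN && hi <= INT32_MAX;
}

/* Tries hints above and below origin; returns a read/write mapping close to
 * origin, or NULL if none of the probes landed within range. */
void *alloc_near(const void *origin, size_t len)
{
    const uintptr_t base = (uintptr_t)origin;
    const uintptr_t step = (uintptr_t)1 << 24;          /* probe every 16 MB */

    for (uintptr_t off = step; off < ((uintptr_t)1 << 31); off += step) {
        for (int below = 0; below < 2; below++) {
            if (below && base < off)
                continue;                                /* would underflow */
            uintptr_t hint = (below ? base - off : base + off) & ~(uintptr_t)0xFFF;
            void *p = mmap((void *)hint, len, PROT_READ | PROT_WRITE,
                           MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
            if (p == MAP_FAILED)
                continue;
            if (within_2gb(p, len, origin))
                return p;                                /* the hint was honored closely enough */
            munmap(p, len);                              /* placed elsewhere; try the next hint */
        }
    }
    return NULL;
}
```

The mapping is created writable so the patched copy can be written into it; it would then need mprotect() with PROT_EXEC before the code can run.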

The latter will require more than primitive disassembly: it needs full instruction decoding and generation.

There may be other quirks to work around.

Instruction boundaries may also be found by setting the TF flag in the RFLAGS register to make the CPU generate the single-step debug interrupt at the end of execution of every instruction. A debug exception handler will need to catch those and record the value of RIP of the next instruction. I believe this can be done using Structured Exception Handling (SEH) in Windows (never tried with the debug interrupts), not sure about Linux. For this to work you'll have to make all of the code execute, every instruction.

By the way, there is absolute addressing in 64-bit mode; see, for example, the MOV to/from accumulator instructions with opcodes 0A0h through 0A3h.
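
As an illustration of one of those forms, opcode 0A1h with a REX.W prefix loads RAX from a full 64-bit absolute address (written `movabs addr, %rax` in AT&T syntax). The helper below is a hypothetical sketch of emitting it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: writes `movabs addr64, %rax` (REX.W + A1 moffs64),
 * which loads RAX from the absolute address addr64. Returns the length. */
size_t emit_load_rax_abs(uint8_t *buf, uint64_t addr64)
{
    buf[0] = 0x48;                               /* REX.W: 64-bit operand size */
    buf[1] = 0xA1;                               /* MOV rAX, moffs */
    for (int i = 0; i < 8; i++)                  /* 64-bit address, little-endian */
        buf[2 + i] = (uint8_t)(addr64 >> (8 * i));
    return 10;
}
```

This form only works with the accumulator, so any other register still needs the movabs-plus-indirect-load sequence.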
