x86 subset with no %gs register: binary-patching code that uses %gs instead of trap-and-emulate?

Posted on 2024-11-27 13:20:07

For reasons too complicated to explain here, I need to run an x86 GCC-compiled Linux program on a platform that is a subset of x86. This platform does not have the %gs register, which means it has to be emulated, because GCC relies on the presence of the %gs register.

Currently I have a wrapper which catches the exceptions when the program attempts to access the %gs register, and emulates it. But this is dog slow. Is there a way that I can patch the opcodes in the ELF ahead of time with equivalent instructions, so that the trap-and-emulate is avoided?


Comments (2)

乖乖哒 2024-12-04 13:20:07
Have you tried compiling your code with the -mno-tls-direct-seg-refs option? From my GCC man page (i686-apple-darwin10-gcc-4.2.1):

   -mtls-direct-seg-refs
   -mno-tls-direct-seg-refs
       Controls whether TLS variables may be accessed with offsets from
       the TLS segment register (%gs for 32-bit, %fs for 64-bit), or
       whether the thread base pointer must be added.  Whether or not this
       is legal depends on the operating system, and whether it maps the
       segment to cover the entire TLS area.

       For systems that use GNU libc, the default is on.

So尛奶瓶 2024-12-04 13:20:07

(This assumes Adam Rosenfield's solution is not applicable. It, or a similar approach, is probably a better way to solve the problem.)

You haven't stated how you're emulating the %gs register, but patching every usage is probably going to be tough in general unless you have some special knowledge about the program, because in the worst (and common) case you only have 2 bytes to overwrite with your patch. Of course, if you're emulating it with something like %es = %gs, it should be relatively straightforward.

Assuming this can somehow be made to work in your case, the strategy is to scan the executable sections of the ELF file and patch any instruction that uses or modifies the GS register. That is at least the following instructions:

  • Any instruction with the GS segment override prefix (65, except for branch instructions, in which case the prefix indicates something else)
  • push gs (0F A8)
  • pop gs (0F A9)
  • mov r/m16, gs (8C /r)
  • mov gs, r/m16 (8E /r)
  • mov gs, r/m64 (REX.W 8E /r) (If you support 64-bit mode)

And any other instructions that allow segment registers (I don't think there are many more, but I'm not 100% sure).
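Before any of these instructions can be searched for, the executable sections of the ELF file have to be located. A minimal sketch of that step, assuming a 32-bit little-endian ELF image that still has its section headers; the function name `executable_sections` is hypothetical, and a real tool would more likely use an existing ELF library:

```python
import struct

SHF_EXECINSTR = 0x4   # section contains executable machine instructions
SHT_NOBITS = 8        # section occupies no space in the file (e.g. .bss)


def executable_sections(elf: bytes):
    """Yield (file_offset, size, virtual_address) for each executable section.

    Assumption: `elf` is a complete 32-bit little-endian ELF image.
    """
    if elf[:4] != b"\x7fELF":
        raise ValueError("not an ELF image")
    if elf[4] != 1 or elf[5] != 1:
        raise ValueError("this sketch only handles 32-bit little-endian ELF")

    # ELF32 header: e_shoff is at offset 0x20, e_shentsize and e_shnum at 0x2E.
    (e_shoff,) = struct.unpack_from("<I", elf, 0x20)
    e_shentsize, e_shnum = struct.unpack_from("<HH", elf, 0x2E)

    for index in range(e_shnum):
        base = e_shoff + index * e_shentsize
        # First six words of an ELF32 section header:
        # sh_name, sh_type, sh_flags, sh_addr, sh_offset, sh_size
        _name, sh_type, sh_flags, sh_addr, sh_offset, sh_size = struct.unpack_from(
            "<6I", elf, base
        )
        if sh_flags & SHF_EXECINSTR and sh_type != SHT_NOBITS:
            yield sh_offset, sh_size, sh_addr
```

Each yielded range is where a decoder would then be pointed; a stripped binary without section headers would need the program headers instead, which this sketch does not cover.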

This is all coming from the Intel® 64 and IA-32 Architectures Software Developer's Manual, Combined Volumes 2A and 2B: Instruction Set Reference, A-Z. Be aware that the instructions are sometimes preceded by other prefixes and sometimes not, so you should probably use a library to do the instruction decoding rather than blindly searching for byte sequences.
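As a rough illustration of the check applied to each decoded instruction, here is a sketch that classifies the bytes of a single instruction against the encodings listed above. It assumes a real decoder has already split the code into instructions; `touches_gs` is a hypothetical name, and the sketch deliberately does not special-case branch instructions or cover any further GS-related opcodes:

```python
# Legacy prefix bytes that may precede the opcode, in any order.
LEGACY_PREFIXES = {0xF0, 0xF2, 0xF3, 0x2E, 0x36, 0x3E, 0x26, 0x64, 0x65, 0x66, 0x67}
GS_OVERRIDE = 0x65
GS_SREG = 5  # segment register numbering: ES=0, CS=1, SS=2, DS=3, FS=4, GS=5


def touches_gs(insn: bytes, mode64: bool = False) -> bool:
    """Return True if this single instruction uses or modifies GS.

    Assumption: `insn` holds exactly one instruction, as delimited by a
    proper decoder. Scanning raw bytes without one gives false positives.
    """
    i = 0
    has_gs_prefix = False
    while i < len(insn) and insn[i] in LEGACY_PREFIXES:
        if insn[i] == GS_OVERRIDE:
            has_gs_prefix = True
        i += 1
    if has_gs_prefix:
        return True  # note: branch instructions are not treated specially here

    # In 64-bit mode a REX prefix (40..4F) may sit just before the opcode.
    if mode64 and i < len(insn) and 0x40 <= insn[i] <= 0x4F:
        i += 1

    rest = insn[i:]
    if rest[:2] in (b"\x0f\xa8", b"\x0f\xa9"):   # push gs / pop gs
        return True
    if len(rest) >= 2 and rest[0] in (0x8C, 0x8E):
        # mov r/m16, Sreg and mov Sreg, r/m16: ModRM.reg selects the segment.
        return (rest[1] >> 3) & 7 == GS_SREG
    return False
```

For example, `touches_gs(bytes.fromhex("65a114000000"))` (a load through a GS override) is flagged, while `touches_gs(bytes.fromhex("8ed8"))` (a move into DS) is not.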

Some of the above instructions should be relatively straightforward to turn into call my_patch or similar, but you're probably going to have trouble finding something that fits in two bytes and works in general. int XX (CD XX) might be a good candidate if you can set up an interrupt vector, but I'm not sure it's going to be faster than the method you're currently using. You will of course need to record which instruction was patched out and have the interrupt handler (or whatever) react differently depending on the return address that it receives.
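A sketch of the bookkeeping this implies, under a few assumptions that are not in the answer itself: the vector number is arbitrary, instructions longer than two bytes are padded with NOPs, and the lookup table is keyed by the return address the handler would see (the address just past the two-byte int). The name `patch_with_int` is hypothetical:

```python
INT_OPCODE = 0xCD
NOP = 0x90


def patch_with_int(image: bytearray, offset: int, length: int,
                   load_addr: int, vector: int, table: dict) -> None:
    """Overwrite one instruction with `int vector` and remember the original.

    image     : mutable copy of an executable section
    offset    : offset of the instruction inside `image`
    length    : length of the instruction in bytes (from the decoder)
    load_addr : virtual address at which image[0] will be mapped
    table     : maps handler return address -> original instruction bytes
    """
    if length < 2:
        raise ValueError("int XX needs at least two bytes")
    if not 0 <= vector <= 0xFF:
        raise ValueError("interrupt vector must fit in one byte")

    original = bytes(image[offset:offset + length])
    # CD XX, then NOP padding (an assumption of this sketch) so that execution
    # falls through cleanly to the following instruction after the handler returns.
    image[offset:offset + length] = bytes([INT_OPCODE, vector]) + bytes([NOP]) * (length - 2)

    # A software interrupt pushes the address of the next instruction,
    # which is the patch site plus the two bytes of `int XX`.
    return_addr = load_addr + offset + 2
    table[return_addr] = original
```

The handler would then look up its return address in `table` and emulate the recorded instruction; whether this beats the existing trap depends on how expensive the platform's interrupt path is.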

You might be able to set up a trampoline if you can find room within -128..127 bytes: use JMP rel8 (EB cb) to jump to the trampoline (usually another JMP, but this time with more room for the target address), which then handles the instruction emulation and jumps back to the instruction following the patched-out %gs usage.
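The displacement arithmetic for such a trampoline can be sketched as follows. Both jump forms measure the displacement from the end of the jump instruction, so the reachable window is relative to the patch site plus 2 (rel8) or plus 5 (rel32). The helper names are hypothetical, and the addresses are plain integers rather than anything read from a real binary:

```python
import struct


def encode_jmp_rel8(site: int, target: int) -> bytes:
    """Encode a 2-byte short jump (EB cb) placed at `site` that lands on `target`."""
    disp = target - (site + 2)          # relative to the next instruction
    if not -128 <= disp <= 127:
        raise ValueError("target is out of rel8 range; no room for a short jump")
    return b"\xeb" + struct.pack("<b", disp)


def encode_jmp_rel32(site: int, target: int) -> bytes:
    """Encode a 5-byte near jump (E9 cd), e.g. from the trampoline to the emulation code."""
    disp = target - (site + 5)          # relative to the next instruction
    if not -(2 ** 31) <= disp <= 2 ** 31 - 1:
        raise ValueError("target is out of rel32 range")
    return b"\xe9" + struct.pack("<i", disp)
```

A patcher would call `encode_jmp_rel8(patch_site, trampoline)` for the two-byte patch and `encode_jmp_rel32(trampoline, emulation_routine)` for the trampoline itself, falling back to another strategy whenever the first call raises.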

Lastly, I'd recommend keeping the trap-and-emulate code running to catch any cases you might not have thought of (self-modifying or injected code, for instance). This way you can also log any unhandled cases and add them to your solution.
