Intercepting a 64-bit Linux kernel function: length of a function pointer on 32/64 bits?
I'm trying to re-implement the old-as-behemoth kernel function intercept (described in this Phrack issue).
The code to replace a 32-bit function call looks like this:
    #define SYSMAPADDR 0x12345678
    #define CODESIZE 7
    static char acct_code[7] = "\xb8\x00\x00\x00\x00" /* movl $0, %eax */
                               "\xff\xe0";            /* jmp *%eax */
    *(long*)&acct_code[1] = (long)my_hijacking_function;
    // here, use either set_pages_rw or trick CR0 to do this:
    memcpy(SYSMAPADDR, acct_code, CODESIZE);
But the 64-bit address of the original function is 0xffffffff12345678 (the kernel is located in low memory).
So will the (long) pointer to the new function fit in just the 4 \x00 bytes of the movl instruction?
By the way, please link this to "Can I replace a Linux kernel function with a module?" and "Overriding functionality with modules in Linux kernel". The hacky method described above is more flexible (it can intercept non-extern functions, so there is no need to recompile the kernel).
Comments (3)
On x86, a direct unconditional jump encodes a signed 32-bit displacement, so in 64-bit mode it cannot reach a target more than 2GB away.
When I wrote a detouring library some time ago, the best option I could come up with to redirect program flow (for x86-64) involved backing up the first M bytes of the target function's prologue and overwriting the prologue with two instructions. I use the %r11 register instead of the accumulator. According to the AMD64 ABI Draft 0.99.5, %r11 is a temporary register that is not preserved across function calls.
The first instruction, movq $addr, %r11, does exactly what it looks like: it loads the specified address into the register. The second instruction, jmp *%r11, forces an unconditional indirect jump to the address stored in %r11.
Appended to the end of the backed-up instructions should be another unconditional indirect jump back into the original target function, to the address immediately after the overwritten instructions. Then, when you want to call the original, you can invoke the address of the backed-up function prologue and program flow continues as usual.
Remember that the number of bytes to back up, M, must be the sum of the size of the store/jump instructions and the remainder of the last overwritten instruction. You don't want to leave any partial instructions behind after doing this voodoo.
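The two-instruction detour and the rule for M described above can be sketched in user-space C. This is only an illustration under stated assumptions, not the answerer's library: the names STUB_SIZE, build_r11_stub and bytes_to_back_up are made up here, and the instruction lengths passed to bytes_to_back_up are assumed to come from some length disassembler that is not shown.

```c
#include <stddef.h>
#include <stdint.h>

/* "movq $imm64, %r11" is 10 bytes and "jmp *%r11" is 3 bytes. */
#define STUB_SIZE 13

/* Hypothetical helper: fill `stub` with code that jumps to `target`. */
static void build_r11_stub(uint8_t stub[STUB_SIZE], uint64_t target)
{
    stub[0] = 0x49;                       /* REX.W + REX.B prefix */
    stub[1] = 0xBB;                       /* mov $imm64, %r11 */
    for (int i = 0; i < 8; i++)           /* 64-bit immediate, little-endian */
        stub[2 + i] = (uint8_t)(target >> (8 * i));
    stub[10] = 0x41;                      /* REX.B prefix */
    stub[11] = 0xFF;                      /* jmp *%r11 */
    stub[12] = 0xE3;
}

/*
 * Hypothetical helper: given the lengths of the instructions at the start
 * of the target function, return M, the smallest number of bytes that
 * covers the stub while ending on an instruction boundary.
 * Returns 0 if the listed instructions are too short to hold the stub.
 */
static size_t bytes_to_back_up(const uint8_t *insn_len, size_t count,
                               size_t stub_size)
{
    size_t m = 0;
    for (size_t i = 0; i < count; i++) {
        m += insn_len[i];
        if (m >= stub_size)
            return m;
    }
    return 0;
}
```

For example, if the target starts with instructions of 1, 3, 4 and 7 bytes, the 13-byte stub ends in the middle of the fourth instruction, so all 15 bytes have to be backed up and the jump back should land at offset 15.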
Note: I am assuming this is for x86_64.
Function pointers are 64 bits wide, and the movl instruction zero-extends its 32-bit result into the 64-bit register, so you'll have to rewrite the machine code. The instruction you want is probably 48 B8 (imm64) (i.e. movq ..., %rax), and I think the jump instruction can be left alone, but I don't know much about this. You should probably add the 'x86-64' and 'assembly' tags to your question.
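To make the size problem concrete, here is a small user-space sketch, assuming the 48 B8 encoding suggested above. The helper names value_after_movl and build_rax_stub are invented for illustration. The first shows what ends up in %rax when only the low four bytes of the address go into a movl immediate; the second builds the 12-byte replacement (10-byte movq plus the unchanged 2-byte jmp), which no longer fits the 7-byte buffer from the question.

```c
#include <stdint.h>

/* 10-byte "movq $imm64, %rax" followed by 2-byte "jmp *%rax". */
#define STUB64_SIZE 12

/*
 * Hypothetical helper: a 4-byte movl immediate keeps only the low 32 bits
 * of the address, and writing %eax zero-extends into %rax.
 */
static uint64_t value_after_movl(uint64_t addr)
{
    return (uint64_t)(uint32_t)addr;
}

/* Hypothetical helper: encode "movq $target, %rax; jmp *%rax". */
static void build_rax_stub(uint8_t stub[STUB64_SIZE], uint64_t target)
{
    stub[0] = 0x48;                       /* REX.W prefix */
    stub[1] = 0xB8;                       /* mov $imm64, %rax */
    for (int i = 0; i < 8; i++)           /* 64-bit immediate, little-endian */
        stub[2 + i] = (uint8_t)(target >> (8 * i));
    stub[10] = 0xFF;                      /* jmp *%rax */
    stub[11] = 0xE0;
}
```

With the address from the question, value_after_movl(0xffffffff12345678) yields 0x12345678, which is not a kernel address, so the jump would go to the wrong place.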
You can use the JMP rel32 (0xE9) instruction to perform a 32-bit relative jump from the current address. This lets you jump to anywhere within 2GB of the source address in five bytes. It also has the advantage that it does not clobber %eax (this may or may not be important in your case).
That said, I would recommend looking into the kprobes API instead. It handles all the hard work of runtime patching for you. It also deals with multiple markers being applied to the same function and other such nastiness, and is portable to multiple platforms. In particular, if your monkey-patching approach were in use, it could conflict with the markers API if that is compiled in, resulting in crashes. It would also result in crashes if dynamically patchable code were located in the first few bytes of a function (LOCK prefixes, etc).
You might also want to look into how ftrace works; depending on kernel configuration, it might be somewhat faster to hook into ftrace instead.
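For the JMP rel32 option, the displacement is measured from the end of the five-byte instruction, and it has to fit in a signed 32-bit value. A minimal user-space sketch of that calculation follows; build_jmp_rel32 is a hypothetical name, and the code assumes the usual two's-complement conversion from uint64_t to int64_t.

```c
#include <stdbool.h>
#include <stdint.h>

#define JMP_REL32_SIZE 5

/*
 * Hypothetical helper: encode a "jmp rel32" that sits at address `src`
 * and lands on `dst`. Returns false when dst is out of rel32 range.
 */
static bool build_jmp_rel32(uint8_t out[JMP_REL32_SIZE],
                            uint64_t src, uint64_t dst)
{
    /* The displacement is relative to the address after the jump. */
    int64_t disp = (int64_t)(dst - (src + JMP_REL32_SIZE));
    if (disp < INT32_MIN || disp > INT32_MAX)
        return false;

    uint32_t d = (uint32_t)(int32_t)disp;
    out[0] = 0xE9;                        /* jmp rel32 opcode */
    for (int i = 0; i < 4; i++)           /* 32-bit displacement, little-endian */
        out[1 + i] = (uint8_t)(d >> (8 * i));
    return true;
}
```

Since kernel text and module code are normally mapped close to each other on x86-64, the range check would usually pass for a hook living in a module, but that is worth verifying rather than assuming.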