What is the full information needed to clone assembly instructions?
I want to write code that reads assembly instructions (x86 only) and recreates them elsewhere in memory in order to build hook code. For example, to hook function X, I need to patch at least its first bytes with a jump. Every instruction that the patch overwrites, fully or partially (how many depends on the code being patched), has to be recreated in a memory block of my own, followed by an instruction that jumps back into the original function X at the first instruction I didn't touch. You probably know what I mean, since this is nothing new. I don't want to write a complete, perfect program, but I do want a fully extensible code base built around a tree, as explained below. To begin, let's imagine some instructions:
- A - "0x12 0x13 . . " this instruction has 4 bytes and the first two are static.
- B - "0x12 . " this instruction has 2 bytes and the first one is static.
For this case I would have a tree that would look like
    Tree
      |
      |
     0x12
     /  \
    B   0x13
          |
          A
So when the code parses an instruction, it tries to reach the instruction with the longest matching prefix; if that fails, it can either stop with an error or fall back to a node higher up in the tree.
The reason for wanting something like this is that I can extend it later with instructions provided by DLLs. That is a must for what I'm doing, because I want to ship code sooner that handles around 90% of instructions and deal with the more advanced ones only if I need them in the future.
So, now my question is: what exactly is the full information that a DLL handling an instruction would need?
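The longest-prefix lookup described above could be sketched as a byte trie. This is only an illustration under assumptions: the names OpcodeTrie, Handler, handle_a and handle_b are hypothetical, the handler signature is a placeholder, and real x86 decoding also has to deal with prefixes and ModRM-selected opcode groups, which a plain byte trie does not capture.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <vector>

// Hypothetical handler type; the real signature is the open question here.
using Handler = int (*)(const std::uint8_t* src);

struct TrieNode {
    Handler handler = nullptr;                               // set if a pattern ends here
    std::map<std::uint8_t, std::unique_ptr<TrieNode>> next;  // children keyed by byte
};

class OpcodeTrie {
public:
    // Register a handler for a fixed ("static") byte prefix.
    void add(const std::vector<std::uint8_t>& prefix, Handler h) {
        assert(h != nullptr);
        TrieNode* node = &root_;
        for (std::uint8_t b : prefix) {
            std::unique_ptr<TrieNode>& child = node->next[b];
            if (!child) child = std::make_unique<TrieNode>();
            node = child.get();
        }
        node->handler = h;
    }

    // Walk as deep as the bytes allow and return the deepest handler seen,
    // which gives the "longest prefix, else fall back to a shorter one" rule.
    Handler find(const std::uint8_t* code, std::size_t avail) const {
        const TrieNode* node = &root_;
        Handler best = nullptr;
        for (std::size_t i = 0; i < avail; ++i) {
            auto it = node->next.find(code[i]);
            if (it == node->next.end()) break;
            node = it->second.get();
            if (node->handler) best = node->handler;
        }
        return best;
    }

private:
    TrieNode root_;
};

// Dummy handlers standing in for the imaginary instructions A and B.
inline int handle_a(const std::uint8_t*) { return 4; }
inline int handle_b(const std::uint8_t*) { return 2; }
```

Registering {0x12, 0x13} for A and {0x12} for B reproduces the tree drawn above: bytes starting with 0x12 0x13 resolve to A, any other 0x12 sequence falls back to B, and an unknown first byte yields no handler.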
Like:
- the address where the instruction starts. (a must of course)
- ? the base address of the module that contains the address where the instruction starts (I suppose this one is needed in case the instruction references some portion of its module's memory)
- ? the previous instruction. I don't know whether any instruction needs to know what the instruction before it did, or something like that
I also want to ask whether the tree structure is OK or whether I will run into problems with it.
So, basically, I'm asking for help deciding what information I need in order to create the most generic code possible that,
given an address, parses the assembly instructions there and, depending on the instruction, calls function pointers in DLLs that copy those instructions.
So, given something like
    void* copy_instructions(void* address, int& len)
    {
        int bytes_copied = 0;
        void* instructions = block of bytes // don't care about the implementation
        do
        {
            // this function will use the tree structure and retrieve a function from a dll
            void* (*copy_instruction)(void*, int*) = get_a_handler_to_instruction_at(address)
            if (copy_instruction != NULL)
            {
                int insn_len = 0; // separate name so it does not shadow the len parameter
                // I want to know how to make this function complete in terms of what it needs for every case
                void* instruction = copy_instruction(address, &insn_len, ...)
                if (!instruction)
                    fail
                instructions += instruction // don't care about the implementation
                address += insn_len
                bytes_copied += insn_len
            }
            else
                fail
        }
        while (bytes_copied < 5)
        // address has already been advanced past the copied bytes,
        // so it points at the first untouched instruction
        add_instructions_jump_to(instructions, address)
        len = bytes_copied;
        return instructions
    }
My questions are:
What would a complete "copy_instruction" function header look like?
Is the tree described above OK for implementing "get_a_handler_to_instruction_at", or do I need something else?
Comments (1)
In order to hook a function you'll need to:
- Parse the instructions that your JMP is going to overwrite. You need to find out how long each one is and whether it uses a relative address (e.g. Jcc, JMP near, CALL near, JMP/CALL qword ptr [RIP+something], MOV EAX, dword ptr [RIP+something]) and, if this is so, the target address.
- Adjust the instructions that use relative addresses. There may be a problem here if the instructions have short relative addresses (8-bit (as in short Jcc) or even 16-bit), which are too short for simple direct patching. In this case you will need to reassemble such instructions with longer relative addresses (this will require either inserting/changing an instruction prefix or changing the Mod/RM/SIB bytes). Keep in mind that the relative addresses are relative to the instruction's end (or, IOW, the beginning of the next instruction), which means that if the adjusted instruction is longer than the original, the relative address will have to account for the instruction length difference as well. Ideally, you should also be prepared to handle the case when the original instructions that you overwrite jump to one another. You don't want their copies to jump back into the overwritten code.
- Store the adjusted instructions in their new place and add after them a JMP instruction that jumps to the first untouched (by overwriting) instruction in the original function.
After this, in most situations hooking should just work. Problems will arise if there's any other code generated by the compiler that expects the original instructions to be at their original place and unchanged.
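The relative-address adjustment can be illustrated with a small sketch. It assumes 32-bit code, passes addresses as plain integers so the arithmetic can be checked without touching live code, and covers only three encodings: E8/E9 rel32, EB rel8 and 7x rel8. The helper names (relocate_rel32, widen_jmp_short, widen_jcc_short) are hypothetical, and the case of overwritten instructions jumping to one another is not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Little-endian 32-bit access without alignment assumptions.
inline std::uint32_t read_u32(const std::uint8_t* p) {
    return std::uint32_t(p[0]) | (std::uint32_t(p[1]) << 8) |
           (std::uint32_t(p[2]) << 16) | (std::uint32_t(p[3]) << 24);
}
inline void write_u32(std::uint8_t* p, std::uint32_t v) {
    for (int i = 0; i < 4; ++i) p[i] = static_cast<std::uint8_t>(v >> (8 * i));
}
// Sign-extend an 8-bit displacement to a 32-bit two's-complement pattern.
inline std::uint32_t sext8(std::uint8_t b) {
    return b < 0x80 ? std::uint32_t(b) : std::uint32_t(b) - 0x100u;
}

// Re-emit "E8/E9 rel32" (CALL/JMP near) at dst_addr so it still reaches the
// original target. Displacements are relative to the END of the instruction.
inline std::size_t relocate_rel32(const std::uint8_t* insn, std::uint32_t src_addr,
                                  std::uint8_t* out, std::uint32_t dst_addr) {
    assert(insn[0] == 0xE8 || insn[0] == 0xE9);
    std::uint32_t target = src_addr + 5 + read_u32(insn + 1);
    out[0] = insn[0];
    write_u32(out + 1, target - (dst_addr + 5));
    return 5;
}

// Widen "EB rel8" (JMP short, 2 bytes) into "E9 rel32" (5 bytes).
inline std::size_t widen_jmp_short(const std::uint8_t* insn, std::uint32_t src_addr,
                                   std::uint8_t* out, std::uint32_t dst_addr) {
    assert(insn[0] == 0xEB);
    std::uint32_t target = src_addr + 2 + sext8(insn[1]);
    out[0] = 0xE9;
    write_u32(out + 1, target - (dst_addr + 5));
    return 5;
}

// Widen "7x rel8" (Jcc short, 2 bytes) into "0F 8x rel32" (Jcc near, 6 bytes).
// The condition code is the low nibble of the opcode in both forms.
inline std::size_t widen_jcc_short(const std::uint8_t* insn, std::uint32_t src_addr,
                                   std::uint8_t* out, std::uint32_t dst_addr) {
    assert((insn[0] & 0xF0) == 0x70);
    std::uint32_t target = src_addr + 2 + sext8(insn[1]);
    out[0] = 0x0F;
    out[1] = static_cast<std::uint8_t>(0x80 | (insn[0] & 0x0F));
    write_u32(out + 2, target - (dst_addr + 6));
    return 6;
}
```

Note how the widened forms subtract the new instruction length (5 or 6) rather than the original 2, which is the length-difference correction mentioned above. All arithmetic is done modulo 2^32, matching address wraparound in 32-bit mode.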
As for the data structure, you replace N bytes of the original code. N is 5 for a 32-bit jump. Those N bytes will correspond to at most N original instructions. You'll need to save those 1 to N instructions in their entirety (every instruction is at most 15 bytes long, IIRC), then parse, possibly adjust and store in the new place. You don't really need a tree here, an array would suffice. An element per instruction. Simple. But it's quite some code that needs to be carefully written and debugged/tested.
Please see the related questions. There may be valuable details.
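The array-per-instruction idea might look like the sketch below. The type and function names (SavedInsn, SavedPrologue, save_prologue) are hypothetical, and the length decoder is an assumption: toy_length only recognizes three register-form opcodes from a typical 32-bit prologue and stands in for a real instruction length disassembler.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

constexpr std::size_t kPatchSize  = 5;   // E9 + rel32
constexpr std::size_t kMaxInsnLen = 15;  // x86 instruction length limit

struct SavedInsn {
    std::uint8_t bytes[kMaxInsnLen];  // the instruction, saved in its entirety
    std::uint8_t length;
};

struct SavedPrologue {
    SavedInsn   insns[kPatchSize];  // N bytes cover at most N instructions
    std::size_t count = 0;          // instructions saved
    std::size_t total = 0;          // bytes covered, >= kPatchSize on success
};

// Caller-supplied length decoder; returns 0 for an unknown instruction.
using LengthFn = std::size_t (*)(const std::uint8_t*);

// Save whole instructions until at least kPatchSize bytes are covered.
inline bool save_prologue(const std::uint8_t* code, LengthFn length_of,
                          SavedPrologue& out) {
    assert(code && length_of);
    out.count = 0;
    out.total = 0;
    // Each pass adds at least one byte, so at most kPatchSize slots are used.
    while (out.total < kPatchSize) {
        std::size_t len = length_of(code + out.total);
        if (len == 0 || len > kMaxInsnLen) return false;
        SavedInsn& slot = out.insns[out.count++];
        std::memcpy(slot.bytes, code + out.total, len);
        slot.length = static_cast<std::uint8_t>(len);
        out.total += len;
    }
    return true;
}

// Toy decoder for illustration only (register-form encodings assumed).
inline std::size_t toy_length(const std::uint8_t* p) {
    switch (p[0]) {
        case 0x55: return 1;  // push ebp
        case 0x8B: return 2;  // mov r32, r/m32 with a register operand
        case 0x83: return 3;  // group-1 op r/m32, imm8 with a register operand
        default:   return 0;  // unknown to this toy decoder
    }
}
```

For the bytes 55 8B EC 83 EC 10 (push ebp; mov ebp, esp; sub esp, 0x10) this saves three instructions covering six bytes, so the jump back would have to land six bytes into the original function, not five.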
EDIT: Answering the main question:
I think, the main function to "copy" all instructions (copy_instructions()) may indeed be defined as you've defined it. You may want to return an error code from it, though, in case it fails (to allocate memory or disassemble unknown instruction or something else). It may be helpful. I can't see what else you'd need from/for the caller.
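One possible shape for such an interface, with an error code as suggested, is sketched below. Everything here is an assumption made for illustration: the names CopyStatus, CopyRequest, CopyResult, CopyInstructionFn and LookupFn are hypothetical, and the toy handler only knows the one-byte NOP. Because relative operands are relative to the instruction itself, the sketch passes only the source and destination positions to a handler.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

enum class CopyStatus { Ok, UnknownInstruction, OutOfSpace };

struct CopyRequest {
    const std::uint8_t* src;       // first byte of the original instruction
    std::uint8_t*       dst;       // where the copy will live
    std::size_t         dst_room;  // bytes still free at dst
};

struct CopyResult {
    CopyStatus  status;
    std::size_t src_len;  // bytes consumed from the original code
    std::size_t dst_len;  // bytes emitted; may differ if the copy was widened
};

using CopyInstructionFn = CopyResult (*)(const CopyRequest&);
using LookupFn = CopyInstructionFn (*)(const std::uint8_t* code);

// Copy whole instructions until at least min_bytes of the original are covered.
// On success src_copied is the offset of the first untouched instruction.
inline CopyStatus copy_instructions(const std::uint8_t* src, std::size_t min_bytes,
                                    std::uint8_t* dst, std::size_t dst_room,
                                    LookupFn lookup,
                                    std::size_t& src_copied, std::size_t& dst_written) {
    assert(src && dst && lookup);
    src_copied = 0;
    dst_written = 0;
    while (src_copied < min_bytes) {
        CopyInstructionFn handler = lookup(src + src_copied);
        if (!handler) return CopyStatus::UnknownInstruction;
        CopyRequest req{src + src_copied, dst + dst_written, dst_room - dst_written};
        CopyResult res = handler(req);
        if (res.status != CopyStatus::Ok) return res.status;
        if (res.src_len == 0) return CopyStatus::UnknownInstruction;  // no progress
        src_copied += res.src_len;
        dst_written += res.dst_len;
    }
    return CopyStatus::Ok;
}

// Toy handler: copies a one-byte NOP (0x90) verbatim.
inline CopyResult copy_nop(const CopyRequest& req) {
    if (req.dst_room < 1) return {CopyStatus::OutOfSpace, 0, 0};
    req.dst[0] = req.src[0];
    return {CopyStatus::Ok, 1, 1};
}

// Toy lookup: only recognizes NOP.
inline CopyInstructionFn lookup_nop_only(const std::uint8_t* code) {
    return code[0] == 0x90 ? copy_nop : nullptr;
}
```

Reporting source and destination lengths separately lets a handler emit a longer encoding than it consumed. Appending the final jump back to the original function is left out of this sketch.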