How to get the entire code of a method in memory so I can calculate its hash at runtime?
I need to make a function like this:
type
  TProcedureOfObject = procedure of object;

function TForm1.CalculateHashValue(AMethod: TProcedureOfObject): string;
var
  MemStream: TMemoryStream;
begin
  Result := '';
  MemStream := TMemoryStream.Create;
  try
    // how to get the code of AMethod into TMemoryStream?
    Result := MD5(MemStream); // I already have the MD5 function
  finally
    MemStream.Free;
  end;
end;
I use Delphi 7.
Edit:
Thank you to Marcelo Cantos & gabr for pointing out that there is no consistent way to find the procedure size due to compiler optimization. And thank you to Ken Bourassa for reminding me of the risks. The target procedure (the procedure whose hash I would like to compute) is my own and I don't call any other routines from it, so I can guarantee that it won't change.
After reading the answers and Delphi 7 help file about the $O directive, I have an idea.
I'll make the target procedure like this:
procedure TForm1.TargetProcedure(Sender: TObject);
begin
  {$O-}
  // do things here
  asm
    nop;
    nop;
    nop;
    nop;
    nop;
  end;
  {$O+}
end;
The 5 successive nops at the end of the procedure would act like a bookmark. One could predict the end of the procedure with gabr's trick, and then scan for the 5 nops nearby to find out the hopefully correct size.
Now while this idea sounds worth trying, I...uhm... don't know how to put it into working Delphi code. I have no experience with lower-level programming, such as how to get the entry point and put the entire code of the target procedure into a TMemoryStream while scanning for the 5 nops.
I'd be very grateful if someone could show me some practical examples.
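The marker scan described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `FindNopMarker`, `TByteBuf` and the `Sample` bytes are hypothetical names and data, the code assumes 32-bit x86 where `nop` is the single byte $90, and it treats any run of five $90 bytes as the marker, even though such bytes could also occur inside operands of other instructions.

```pascal
program NopScan;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  // Hypothetical helper type for indexing raw memory byte by byte.
  TByteBuf = array[0..MaxInt - 1] of Byte;
  PByteBuf = ^TByteBuf;

const
  NOP_OPCODE = $90; // x86 encoding of "nop"
  MARKER_LEN = 5;   // number of consecutive nops used as the bookmark

// Returns the offset of the first run of MARKER_LEN nop bytes within the
// first MaxLen bytes at Code, or -1 if no such run is found.
function FindNopMarker(Code: Pointer; MaxLen: Integer): Integer;
var
  Buf: PByteBuf;
  I, Run: Integer;
begin
  Result := -1;
  Buf := PByteBuf(Code);
  Run := 0;
  for I := 0 to MaxLen - 1 do
  begin
    if Buf^[I] = NOP_OPCODE then
    begin
      Inc(Run);
      if Run = MARKER_LEN then
      begin
        Result := I - MARKER_LEN + 1;
        Exit;
      end;
    end
    else
      Run := 0;
  end;
end;

const
  // Stand-in for real code: push ebp; mov ebp, esp; 5 x nop; pop ebp; ret
  Sample: array[0..9] of Byte =
    ($55, $8B, $EC, $90, $90, $90, $90, $90, $5D, $C3);

begin
  WriteLn(FindNopMarker(@Sample, Length(Sample))); // prints 3
end.
```

With a real method, `Code` would be the entry point (for a `procedure of object` value that is `TMethod(AMethod).Code`), and the bytes from the entry point up to the marker could then be written to the stream with `MemStream.WriteBuffer(Code^, Offset + MARKER_LEN)`. `MaxLen` should be kept small enough that the scan stays inside the module's code.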
Comments (5)
Marcelo has correctly stated that this is not possible in general.
The usual workaround is to use the address of the method that you want to calculate the hash for and the address of the next method. For the time being the compiler lays out methods in the same order as they are defined in the source code, so this trick works.
Be aware that subtracting the two method addresses may give you a slightly too large result: the first method may actually end a few bytes before the next method starts.
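A minimal sketch of this workaround, assuming the compiler really does place the two routines next to each other in source order. `CopyCodeRange`, `TargetProcedure` and `EndOfTargetMarker` are hypothetical names, and standalone procedures are used only to keep the example short; for a `procedure of object` value the entry point would be `TMethod(AMethod).Code`.

```pascal
program MethodSpan;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  Classes;

// Copies the bytes in [StartAddr, EndAddr) into Dest and returns how many
// bytes were copied. Returns 0 if the range is empty or reversed.
function CopyCodeRange(StartAddr, EndAddr: Pointer; Dest: TStream): Integer;
begin
  Result := PAnsiChar(EndAddr) - PAnsiChar(StartAddr);
  if Result <= 0 then
  begin
    Result := 0;
    Exit;
  end;
  Dest.WriteBuffer(StartAddr^, Result);
end;

// The routine whose code should be hashed.
procedure TargetProcedure;
begin
  WriteLn('doing work');
end;

// Must be declared directly after TargetProcedure; its address is used as
// the (approximate) end of TargetProcedure.
procedure EndOfTargetMarker;
begin
end;

var
  Stream: TMemoryStream;
begin
  Stream := TMemoryStream.Create;
  try
    // The count may include a few padding bytes between the two routines.
    WriteLn(CopyCodeRange(@TargetProcedure, @EndOfTargetMarker, Stream),
      ' bytes copied');
  finally
    Stream.Free;
  end;
end.
```

The resulting stream could then be passed to the existing MD5 function. If the compiler ever reorders the routines, the computed size becomes zero or meaningless, which is why the range check is there.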
The only way I can think of is turning on TD32 debug info and trying JCLDebug to see if you can find the length in the debug info with it. Relocation shouldn't affect the length, so the length in the binary should be the same as in memory.
Another way would be to scan the code for a ret opcode (plain ret, or ret with a stack-adjustment operand). That is less safe, but it would probably guard at least part of the function without having to mess with debug info.
The potential deal breaker, though, is short routines that are tail-call optimized (i.e. they jump instead of ret). But I don't know if Delphi does that.
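A rough sketch of the ret scan, assuming 32-bit x86 near returns only ($C3 for plain `ret`, $C2 followed by a 16-bit operand for `ret n`). `FindRetEnd` and `TByteBuf` are hypothetical names. The scan does not decode instructions, so it stops early whenever one of these byte values appears inside another instruction (for example `mov eax, ebx` encodes as $8B $C3), and it only finds the first return of a routine that has several.

```pascal
program RetScan;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  // Hypothetical helper type for indexing raw memory byte by byte.
  TByteBuf = array[0..MaxInt - 1] of Byte;
  PByteBuf = ^TByteBuf;

const
  RET_NEAR       = $C3; // ret
  RET_NEAR_IMM16 = $C2; // ret imm16 (opcode plus a two-byte operand)

// Returns the number of bytes from Code up to and including the first
// ret instruction found within MaxLen bytes, or -1 if none is found.
function FindRetEnd(Code: Pointer; MaxLen: Integer): Integer;
var
  Buf: PByteBuf;
  I: Integer;
begin
  Result := -1;
  Buf := PByteBuf(Code);
  for I := 0 to MaxLen - 1 do
    case Buf^[I] of
      RET_NEAR:
        begin
          Result := I + 1;
          Exit;
        end;
      RET_NEAR_IMM16:
        begin
          // Only accept it if the two operand bytes fit inside the range.
          if I + 3 <= MaxLen then
            Result := I + 3;
          Exit;
        end;
    end;
end;

const
  // push ebp; mov ebp, esp; pop ebp; ret 4
  SampleRetN: array[0..6] of Byte = ($55, $8B, $EC, $5D, $C2, $04, $00);
  // push ebx; pop ebx; ret
  SampleRet: array[0..2] of Byte = ($53, $5B, $C3);

begin
  WriteLn(FindRetEnd(@SampleRetN, Length(SampleRetN))); // prints 7
  WriteLn(FindRetEnd(@SampleRet, Length(SampleRet)));   // prints 3
end.
```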
You might struggle with this. Functions are defined by their entry point, but I don't think that there is any consistent way to find out the size. In fact, optimisers can do screwy things like merge two similar functions into a common shared function with multiple entry points (whether or not Delphi does stuff like this, I don't know).
EDIT: The 5-nop trick isn't guaranteed to work either. In addition to Remy's caveats (see his comment below), the compiler merely has to guarantee that the nops are the last thing to execute, not that they are the last thing to appear in the function's binary image. Turning off optimisations is a rather baroque "solution" that still won't fix all the issues that others have raised.
In short, there are simply too many variables here for what you are trying to do. A better approach would be to target compilation units for checksumming (assuming it satisfies whatever overall objective you have).
I achieve this by letting Delphi generate a MAP file and sorting the symbols by start address in ascending order. The length of each procedure or method is then the next symbol's start address minus this symbol's start address. This is most likely as brittle as the other solutions suggested here, but I have this code working in production right now and it has worked fine for me so far.
My implementation that reads the MAP file and calculates the sizes can be found here at line 3615 (TEditorForm.RemoveUnusedCode).
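The size calculation could look roughly like the sketch below. It is not the implementation referred to above. It assumes the lines come from the address-sorted publics section of a detailed MAP file and have the form `SSSS:OOOOOOOO  Name` (segment, hexadecimal offset, symbol name); that layout, the helper names `ParseMapLine` and `SymbolSize`, and the sample symbols and offsets are all assumptions made for illustration.

```pascal
program MapSizes;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Splits a line such as ' 0001:0004A2B0       Unit1.TForm1.TargetProcedure'
// into segment, offset and name. Returns False for lines of another shape.
function ParseMapLine(const Line: string; out Segment: string;
  out Offset: Integer; out Name: string): Boolean;
var
  S: string;
begin
  S := Trim(Line);
  Result := (Length(S) > 14) and (S[5] = ':');
  if not Result then
    Exit;
  Segment := Copy(S, 1, 4);
  Offset := StrToIntDef('$' + Copy(S, 6, 8), -1);
  Name := Trim(Copy(S, 14, MaxInt));
  Result := (Offset >= 0) and (Name <> '');
end;

// Returns the next symbol's offset minus SymbolName's offset, or -1 if the
// symbol is missing, is the last entry, or is followed by another segment.
// Lines must already be sorted by address.
function SymbolSize(Lines: TStrings; const SymbolName: string): Integer;
var
  I, Offset, NextOffset: Integer;
  Segment, NextSegment, Name, NextName: string;
begin
  Result := -1;
  for I := 0 to Lines.Count - 2 do
    if ParseMapLine(Lines[I], Segment, Offset, Name) and
       SameText(Name, SymbolName) then
    begin
      if ParseMapLine(Lines[I + 1], NextSegment, NextOffset, NextName) and
         (NextSegment = Segment) then
        Result := NextOffset - Offset;
      Exit;
    end;
end;

var
  Lines: TStringList;
begin
  Lines := TStringList.Create;
  try
    // Hypothetical entries; a real program would use Lines.LoadFromFile.
    Lines.Add(' 0001:0004A2B0       Unit1.TForm1.TargetProcedure');
    Lines.Add(' 0001:0004A2F4       Unit1.TForm1.NextProcedure');
    WriteLn(SymbolSize(Lines, 'Unit1.TForm1.TargetProcedure')); // prints 68
  finally
    Lines.Free;
  end;
end.
```

As with subtracting method addresses, the size obtained this way may include padding between routines.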
Even if you do achieve it, there are a few things you need to be aware of...
The hash will often change even if the function itself didn't change.
For example, the hash will change if your function calls another function whose address has changed since the last build. I think the hash might also change if your function calls itself recursively and your unit (not necessarily your function) changed since the last build.
As for how it could be achieved, gabr's suggestion seems to be the best one... But it's really prone to break over time.