There are a few things that help with reversing Delphi programs:
You get the full form data, including the names of the event handler methods.
All members with published visibility have metadata that RTTI uses.
The compiler is pretty bad at optimizing. It does no whole-program optimization, and the assembly is usually a straightforward translation of the original source with only minor optimizations. (At least it was in the versions I used; it might have improved since then.)
All classes, even those compiled with RTTI off, have some level of metadata available. In particular, it's possible to get the name and inheritance structure of classes. And for any instance of a class you happen to see in the debugger, you can get its VMT and thus its class name.
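To make the last point concrete, here is a rough Python sketch of the lookup a debugger script would do: follow the instance's first field to the VMT, read the class-name pointer stored at a negative offset from it, and decode the length-prefixed ShortString found there. It runs against a synthetic memory image rather than a real process. The offset of -44 is an assumption (it is what I recall for older 32-bit Delphi; it differs between compiler versions and on 64-bit), and all helper names are made up for illustration.

```python
import struct

# ASSUMPTION: offset of the class-name pointer relative to the VMT pointer.
# -44 is what I recall for older 32-bit Delphi; verify it for your version.
VMT_CLASSNAME_OFFSET = -44


def read_u32(mem, base, addr):
    """Read a little-endian 32-bit value at virtual address addr."""
    return struct.unpack_from("<I", mem, addr - base)[0]


def read_shortstring(mem, base, addr):
    """Decode a Pascal ShortString: one length byte, then the characters."""
    start = addr - base
    length = mem[start]
    return mem[start + 1:start + 1 + length].decode("latin-1")


def class_name_of(mem, base, instance_addr, name_offset=VMT_CLASSNAME_OFFSET):
    """Instance -> VMT pointer -> class-name pointer -> ShortString."""
    vmt = read_u32(mem, base, instance_addr)  # first field of an object
    name_ptr = read_u32(mem, base, vmt + name_offset)
    return read_shortstring(mem, base, name_ptr)


def build_demo_image():
    """Build a tiny fake memory image holding one object (hypothetical data)."""
    base = 0x400000
    mem = bytearray(0x100)
    name_addr, vmt_addr, inst_addr = base + 0x10, base + 0x80, base + 0xC0
    name = b"TMainForm"
    mem[0x10] = len(name)
    mem[0x11:0x11 + len(name)] = name
    struct.pack_into("<I", mem, vmt_addr + VMT_CLASSNAME_OFFSET - base, name_addr)
    struct.pack_into("<I", mem, inst_addr - base, vmt_addr)
    return bytes(mem), base, inst_addr


if __name__ == "__main__":
    mem, base, inst = build_demo_image()
    print(class_name_of(mem, base, inst))
```

In a real session the `mem`/`base` pair would be replaced by whatever memory-read primitive your debugger exposes.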
Delphi uses text files describing the content of your form and hooks up event handlers by name. This approach obviously needs enough metadata to deserialize that textual representation of a form and hook up the event handlers by name.
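As an illustration of how much that form description gives away, here is a small Python sketch that walks a text-format DFM and lists which handler name is attached to which component event. The sample form and the function name are made up, and the matching is a heuristic: it assumes event properties start with `On` and are assigned a bare identifier, and it ignores collections and multi-line values.

```python
import re

# "object Button1: TButton" (also "inherited" / "inline" variants).
OBJECT_RE = re.compile(r"^(?:object|inherited|inline)\s+(\w+)\s*:\s*(\w+)")
# Heuristic: "OnClick = Button1Click" style assignments.
EVENT_RE = re.compile(r"^(On\w+)\s*=\s*([A-Za-z_]\w*)$")


def list_event_handlers(dfm_text):
    """Return (component, event, handler) tuples found in a text DFM."""
    stack = []   # names of the objects we are currently nested inside
    found = []
    for raw in dfm_text.splitlines():
        line = raw.strip()
        match = OBJECT_RE.match(line)
        if match:
            stack.append(match.group(1))
            continue
        if line == "end":
            if stack:
                stack.pop()
            continue
        match = EVENT_RE.match(line)
        if match and stack:
            found.append((stack[-1], match.group(1), match.group(2)))
    return found


# Hypothetical sample form, not taken from any real program.
SAMPLE_DFM = """\
object Form1: TForm1
  Caption = 'Demo'
  OnCreate = FormCreate
  object Button1: TButton
    Caption = 'OK'
    OnClick = Button1Click
  end
end
"""

if __name__ == "__main__":
    for component, event, handler in list_event_handlers(SAMPLE_DFM):
        print(f"{component}.{event} -> {handler}")
```

With the handler names in hand, the remaining work is finding the matching method addresses in the published-method metadata.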
An alternative some other GUI toolkits use is auto-generating code that initializes the form and hooks up the event handlers in code. Since this code directly uses pointers to the event handlers and directly assigns to properties or calls setters, it doesn't need any metadata. As a side effect, reversing becomes a bit harder.
It shouldn't be too hard to create a program that transforms a DFM file into a series of hardcoded instructions that create the form instead. Then a tool like DeDe won't work that well anymore. But that doesn't gain you much in practice.
Figuring out which event handler corresponds to which control/event is still rather easy, especially since tools like FLIRT identify most library functions. So you just need to set a breakpoint on the one you're interested in and then step into the user code.
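The DFM-to-code transformation mentioned above could look roughly like the following Python sketch. It takes an already-parsed component list (the dictionary layout is my own invention) and emits Pascal statements that create each component, set its properties, and assign handlers directly, so no handler names need to survive in the binary. It is only a sketch of the idea, not a complete converter.

```python
def emit_form_code(components):
    """Emit Pascal statements that build a form without DFM streaming.

    Each component is a dict with hypothetical keys:
    name, cls, parent, props (values already Pascal literals), events.
    """
    lines = []
    for comp in components:
        name = comp["name"]
        lines.append(f"{name} := {comp['cls']}.Create(Self);")
        lines.append(f"{name}.Parent := {comp['parent']};")
        for prop, value in comp.get("props", {}).items():
            lines.append(f"{name}.{prop} := {value};")
        # Direct method assignment: the compiler resolves the address,
        # so the handler does not have to be looked up by name at run time.
        for event, handler in comp.get("events", {}).items():
            lines.append(f"{name}.{event} := {handler};")
    return lines


if __name__ == "__main__":
    demo = [{
        "name": "Button1",
        "cls": "TButton",
        "parent": "Self",
        "props": {"Caption": "'OK'"},
        "events": {"OnClick": "Button1Click"},
    }]
    print("\n".join(emit_form_code(demo)))
```

As the answer says, this mostly defeats form-dumping tools; the handlers are still reachable by breaking on the library code that dispatches the event.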
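The statement you make is false. Delphi is not particularly easier to decompile than code produced by other mainstream compilers.
If you were able to prove that the results of decompiling a Delphi executable were of significantly higher quality than in other widely used languages, then your question would carry more weight.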
Story from the trenches: Decompiling a tiny Delphi DLL
I've been through a Delphi decompiling session myself. It was one of those fake-sounding "I lost my sources" things, except I really did lose the sources for a tiny Firebird UDF library. Now, I do know better: I didn't jump right into decompiling, because the library was so small that I knew a rewrite would be much faster.
This DLL exports a function that looks like this:
function udf_do_some_math(Number1, Number2:Currency): Currency;
After doing the sane thing, rewriting the function and running some regression tests, I discovered some obscure corner cases where the new function's result wasn't the same as the old function's result! The trouble was, the new function's result was the correct one: the old DLL contained a BUG, and I had to reproduce the BUG, because with this function consistency is more important than accuracy.
Again, I did the sane thing and tried to "guess" at the BUG. I knew it was a rounding issue but simply couldn't figure out what it was. Finally I decided to give decompilers a try. After all, this was a small library, the entry point was straightforward, and I didn't really need re-compilable code or 100% decompilation: I only needed enough to figure out the old BUG so I could reproduce it!
Decompiling failed! I tried lots of different decompilers, including a couple of "commercial" ones. Most produced what on the surface looked like good data, but not enough to figure out the old bug. The most promising one, the one with version-specific knowledge of the VCL and RTL, gave the worst failure: sure, it figured out the RTL calls and gave them names, but it failed to locate the exported function! The one function I was interested in wasn't shown in the list of entry points, and it should have been straightforward since it's an exported function.
This decompiling attempt should have been easy because:
The code was fairly simple, and there wasn't a lot of it.
It was a DLL with an exported function, with none of the complexity you'd expect from an event-driven exe.
I wasn't interested in re-compilable code, I simply wanted to find an old bug so I can reproduce it.
I didn't ask for Pascal code, assembler would've been good enough.
I knew precisely what the code was doing and how it was doing it. It wasn't cryptic 3rd party code.
My solution
After the decompilers failed me, I turned to my own trusty Delphi IDE for debugging. I wrote a small Delphi program that directly imports the function from the DLL, created a fake Firebird memory manager DLL so my DLL could load, called my old function with the parameters I knew would give bad results, stepped into the code using the debugger, and closely watched the FPU registers. After a few failed attempts I finally noticed that a value was popped from the FPU stack as an integer where it shouldn't have been, so I had my BUG: I had mistakenly declared an Integer local variable where I should have used Currency. Armed with that knowledge, I was able to reproduce the bug.
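To show why that kind of slip only bites in corner cases, here is a small Python simulation. Currency is a fixed-point type with four decimal places, so it is modelled with `Decimal` quantized to 0.0001. The formula itself (a percentage calculation) is purely hypothetical, since the real math in the UDF isn't given; the point is only that an intermediate stored in an Integer loses its fractional part, while inputs that happen to produce a whole-number intermediate agree with the correct version.

```python
from decimal import Decimal, ROUND_HALF_EVEN

FOUR_PLACES = Decimal("0.0001")


def as_currency(value):
    """Model a Currency value: fixed point with four decimal places."""
    return Decimal(value).quantize(FOUR_PLACES, rounding=ROUND_HALF_EVEN)


def percent_of_correct(amount, percent):
    """Hypothetical formula with the intermediate kept as Currency."""
    intermediate = as_currency(amount * percent)
    return as_currency(intermediate / 100)


def percent_of_buggy(amount, percent):
    """Same formula, but the intermediate lands in an Integer local."""
    product = amount * percent
    # Storing into an integer drops the fraction (banker's rounding here).
    intermediate = int(product.to_integral_value(rounding=ROUND_HALF_EVEN))
    return as_currency(Decimal(intermediate) / 100)


if __name__ == "__main__":
    for amount, percent in [(Decimal("10"), Decimal("5")),
                            (Decimal("10.50"), Decimal("3"))]:
        print(amount, percent,
              percent_of_correct(amount, percent),
              percent_of_buggy(amount, percent))
```

Regression tests built from round inputs pass against both versions, which is consistent with the mismatch showing up only in obscure cases.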
The only thing that is easier in Delphi is retrieving the VCL forms. After using decompilers like DeDe you will get the application's user interface, but without any logic. So if you only want to retrieve forms and buttons, Delphi is easier than other compilers, but if you want to know what goes on after a button is clicked, you'll need to use OllyDbg or another debugger/disassembler, just as for any other language that produces executables.
There are pros and cons. I am not sure what angle you're referring to as being easier. There is also a huge difference between a simple one-form application and a very in-depth application that has many forms and tons of classes and functions. It's like Notepad versus Office 2013 (supposing they were coded in Delphi; it's just an example comparing complexity, not language).
In a small app, having the extra information that Delphi apps "usually" contain can make it a breeze. However, in a large application it may "help", but you have a million calls to dig through. They may help you get into the near vicinity, but calls inside of calls inside of calls, then multiple returns used as jumps... it makes you dizzy. Then if the app "was" packed or protected, some things can still be a garbled mess. While it may work programming-wise, reading it can be a lot harder. I was in one the other day where all of the strings were encrypted, so "referenced text strings" were no help, and the encryption was not a simple MD5 or base64; it was some custom algorithm. Maybe an MD5 with a salt, then base64 encoded? I never could get to the exact method used on the strings. I knew what some of them were supposed to be, but couldn't reproduce the method: even though it looked like base64, it was the base64 of a string that was already encrypted somehow... I don't rely on text strings, but in a really large app, every little bit helps.
Of course, my interpretation of this question was looking at a Delphi exe in OllyDbg. I could be off base on where you guys were going with this topic, but I feel that in regards to Olly and reversing, I am on point (if that was what you were talking about) lol.
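The "looks like base64 but decodes to garbage" situation can be reproduced with a short Python sketch. It assumes a plain repeating-key XOR as a stand-in for the unknown custom algorithm (that choice, the key, and the sample string are all hypothetical). A one-way hash such as MD5 would not fit here, because the program has to recover the text at run time. The sketch also shows why knowing some of the plaintexts matters: for an XOR-style scheme, combining the decoded bytes with a known plaintext exposes the key stream.

```python
import base64


def xor_bytes(data, key):
    """XOR data with a repeating key (stand-in for the unknown cipher)."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def protect(plain, key):
    """Encrypt, then base64-encode, as the binary's strings appeared to be."""
    return base64.b64encode(xor_bytes(plain.encode("utf-8"), key)).decode("ascii")


def unprotect(blob, key):
    """Reverse of protect: base64-decode, then decrypt."""
    return xor_bytes(base64.b64decode(blob), key).decode("utf-8")


def looks_like_text(data):
    """Crude check: every byte is printable ASCII."""
    return all(32 <= b < 127 for b in data)


if __name__ == "__main__":
    key = b"\x5a\xa5"                      # hypothetical key
    blob = protect("License expired", key)
    decoded = base64.b64decode(blob)
    print(blob, looks_like_text(decoded))  # valid base64, but not readable text
    # Known plaintext XOR ciphertext reveals the repeating key stream.
    print(xor_bytes(decoded, b"License expired")[:2] == key)
    print(unprotect(blob, key))
```

For anything stronger than XOR the known-plaintext shortcut does not apply, and stepping through the decryption routine in the debugger is the practical route.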