Detecting DLL code splicing
I'm trying to write some functions to detect DLL code splicing. By DLL code splicing I mean modifying the bytes at the start of functions in loaded DLLs so that, instead of running the full function implementation within the DLL, execution jumps to some other location.
My approach so far has been:
First, I get the loaded DLL information (e.g. the image base of each loaded DLL) using the Toolhelp32 API.
For each loaded DLL:
- get each function address (RVA) by reading the DLL's export table in memory
- read 8 bytes at this address in memory
- get the function RVA from the version of the DLL on disk
- parse the PE header of the DLL on disk to convert the RVA to a file offset, and read 8 bytes there too
- compare the two 8-byte sequences
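The RVA-to-file-offset step in the list above can be sketched roughly as follows. This is a minimal Python sketch, not the code used for the tests: the `Section` tuple, the `rva_to_file_offset` helper, and the section layout in `SECTIONS` are all hypothetical, and the layout was picked only so that the first ntdll.dll sample further down (rva=0007e098) maps to fileoffs=0007d498.

```python
from typing import NamedTuple, Optional, Sequence


class Section(NamedTuple):
    virtual_address: int  # IMAGE_SECTION_HEADER.VirtualAddress (an RVA)
    virtual_size: int     # IMAGE_SECTION_HEADER.Misc.VirtualSize
    raw_pointer: int      # IMAGE_SECTION_HEADER.PointerToRawData
    raw_size: int         # IMAGE_SECTION_HEADER.SizeOfRawData


def rva_to_file_offset(rva: int, sections: Sequence[Section]) -> Optional[int]:
    """Map an RVA to an offset in the on-disk file.

    Returns None when the RVA is outside every section, or when it lies in
    the part of a section that has no bytes in the file (zero-filled in memory).
    """
    for s in sections:
        delta = rva - s.virtual_address
        if 0 <= delta < max(s.virtual_size, s.raw_size):
            return s.raw_pointer + delta if delta < s.raw_size else None
    return None


# Hypothetical section layout, chosen for illustration only.
SECTIONS = [
    Section(virtual_address=0x01000, virtual_size=0x7B000,
            raw_pointer=0x00400, raw_size=0x7B000),
    Section(virtual_address=0x7D000, virtual_size=0x05000,
            raw_pointer=0x7C400, raw_size=0x03000),
]

offset = rva_to_file_offset(0x7E098, SECTIONS)
print(hex(offset) if offset is not None else "no file backing")
```

An RVA that falls past a section's raw data has no bytes on disk to compare against, which is why the helper returns `None` there instead of an offset.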
Now I know I'm not doing something quite right, and I may be making a conceptual blunder.
I've been testing with 32-bit notepad.exe. The comparisons succeed for the majority of the functions in the loaded DLLs, but some differences turn up.
For example:
ntdll.dll: ordinal=00000059, rva=0007e098, fileoffs=0007d498, function VA: 7c97e098
disk: 00 00 00 00 00 00 00 00
mem: e4 04 00 00 00 00 00 00
and:
ntdll.dll: ordinal=0000003d, rva=0009d0d8, fileoffs=0009c4d8 function VA: 77a9d0d8
disk: a1 5c 81 f9 77 c3 90 90
mem: a1 5c 81 ad 77 c3 90 90
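As a quick check on the second sample, bytes 1 to 4 can be read as a little-endian 32-bit value. In 32-bit code, opcode a1 is `mov eax, [moffs32]`, so those four bytes would be an absolute address. The sketch below assumes that reading, and assumes the module's load base is the function VA minus the RVA; the variable names are illustrative only.

```python
import struct

disk = bytes.fromhex("a1 5c 81 f9 77 c3 90 90")
mem = bytes.fromhex("a1 5c 81 ad 77 c3 90 90")

# Bytes 1..4 hold the 32-bit absolute address operand of the mov.
disk_addr, = struct.unpack_from("<I", disk, 1)
mem_addr, = struct.unpack_from("<I", mem, 1)
delta = mem_addr - disk_addr

# Base the module appears to be loaded at: function VA minus RVA.
load_base = 0x77A9D0D8 - 0x0009D0D8
# Preferred base this delta would imply, if it really is a relocation delta.
implied_preferred_base = load_base - delta

print(hex(disk_addr), hex(mem_addr), hex(delta))
print(hex(load_base), hex(implied_preferred_base))
```

The difference comes out as -0x4c0000, a whole multiple of 64 KB, which is the kind of value a base relocation would produce.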
Someone mentioned to me that it has something to do with relocations. I can't figure this out, however, and I haven't found any documentation on how this applies here.
Does anyone have some info, or links on this? Or does anyone know where I am failing?
Many thanks in advance.
EDIT: The DLLs are being loaded at their preferred image base (checked by comparing OptionalHeader.ImageBase to the base address of the loaded module in memory).
Therefore I'm stuck trying to figure out why there could be a difference, e.g. in the samples above: why 1312 functions in ntdll seem to match, but the 1313th one doesn't.
Comments (1)
Relocations are a list of virtual offsets at which the image contains absolute addresses. If an image isn't loaded at its preferred image base, the data at every offset listed in the relocation table needs to be adjusted. If your preferred image base is 0x400000 and the DLL loads at 0x500000, you simply need to add 0x100000 to the data at each offset mentioned in the relocation list.
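Here is a minimal Python sketch of that adjustment, applied to a copy of the on-disk bytes before comparing them with memory. It assumes 32-bit relocations that patch a full 4-byte address; `apply_relocations` and the sample buffer are made up for illustration, and the offsets are plain buffer offsets, not RVAs.

```python
import struct
from typing import Iterable


def apply_relocations(image: bytearray, reloc_offsets: Iterable[int],
                      delta: int) -> None:
    """Add `delta` to the 32-bit little-endian value at each listed offset."""
    for off in reloc_offsets:
        value, = struct.unpack_from("<I", image, off)
        # Wrap to 32 bits so a negative delta also works.
        struct.pack_into("<I", image, off, (value + delta) & 0xFFFFFFFF)


preferred_base = 0x400000
actual_base = 0x500000
delta = actual_base - preferred_base  # 0x100000

# Hypothetical on-disk bytes: mov eax, [0x00401000]; ret
on_disk = bytearray.fromhex("a1 00 10 40 00 c3")
apply_relocations(on_disk, [1], delta)  # the address operand starts at offset 1

print(on_disk.hex(" "))
```

After the patch the buffer holds what the loader would have left in memory, so any remaining difference is not explained by relocation.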
See e.g. the "PE File Base Relocations" section in "Peering Inside the PE" for the format.
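For orientation, the relocation data is a sequence of blocks: each starts with a 32-bit page RVA and a 32-bit block size (which includes that 8-byte header), followed by 16-bit entries whose top 4 bits are the type and whose low 12 bits are the offset within the page. The sketch below walks such data in Python, assuming 32-bit images where the interesting type is IMAGE_REL_BASED_HIGHLOW (3). The `iter_highlow_rvas` helper and the sample `block` are made up for illustration, not read from a real ntdll.dll.

```python
import struct
from typing import Iterator

IMAGE_REL_BASED_ABSOLUTE = 0  # padding entry, no fixup
IMAGE_REL_BASED_HIGHLOW = 3   # patch a full 32-bit address


def iter_highlow_rvas(reloc: bytes) -> Iterator[int]:
    """Yield the RVA of every HIGHLOW fixup in raw base relocation data."""
    pos = 0
    while pos + 8 <= len(reloc):
        page_rva, block_size = struct.unpack_from("<II", reloc, pos)
        # Stop on a malformed or truncated block.
        if block_size < 8 or pos + block_size > len(reloc):
            break
        count = (block_size - 8) // 2
        for i in range(count):
            entry, = struct.unpack_from("<H", reloc, pos + 8 + 2 * i)
            if entry >> 12 == IMAGE_REL_BASED_HIGHLOW:
                yield page_rva + (entry & 0x0FFF)
        pos += block_size


# Hypothetical block for page 0x9d000: one HIGHLOW fixup at page offset 0x0d9
# and one ABSOLUTE padding entry. Block size = 8-byte header + 2 entries.
block = struct.pack("<IIHH", 0x9D000, 12,
                    (IMAGE_REL_BASED_HIGHLOW << 12) | 0x0D9,
                    IMAGE_REL_BASED_ABSOLUTE << 12)

print([hex(rva) for rva in iter_highlow_rvas(block)])
```

Each RVA produced this way that falls inside the 8 bytes being compared marks 4 bytes that should be adjusted by the load delta, or simply ignored, before deciding that a function was spliced.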