Advice on how to write a debug format conversion tool
I'm looking to write a tool that converts debug symbols from one format into another that is usable under GDB. This seems like a tedious and potentially complex project, so I'm not exactly sure how to tackle it.
Initially I'm aiming to convert the Turbo Debug Symbol table (TDS) emitted by borland compilers into something like stabs or dwarf format (dwarf seems to be preferred, from my research). But ideally I want to design my tool to be easy enough to extend so that it could convert other formats later on, e.g. codeview4 or maybe even pdb.
My primary motivations for creating this are:
- Interoperability. If I can convert a foreign debug format into a form gdb can work with, then source-level debugging becomes possible on binaries built with a compiler other than gcc. This means any frontend debugging interface that uses gdb as a backend will work as well.
- No other tools exist. I searched google for similar tools and the closest I've found is tds2dbg. But it doesn't quite do what I'm looking for.
What I have to work with at the moment:
- I already have a debug hook API that can understand the TDS debug format. I can use that to help me get at the needed information from the source format I'm converting from.
- For the scope of this project, I'm mainly interested in getting this to work under the win32 environment. Other platforms and tools I'm not really concerned about.
- The target dwarf debug format I'm converting to. This one I'm really not familiar with at all. I have used gcc ports like MinGW before and debugged their output with gdb using the dwarf format. But I don't have any idea how this format is implemented on windows.
The last point is the one I'm concerned about. I'm reading through the dwarf spec documentation but I'm having trouble really understanding how it works. There's so much detail in there, but at the same time it doesn't have any details about how dwarf gets implemented in object files and image files on a platform that doesn't use ELF natively, namely the PE-COFF format that windows uses. The documentation is also a very dry read: long sentences make it hard to understand, and diagrams and illustrations are sparse. I came across an API called libdwarf that should take most of the parsing work out of interpreting dwarf. The problem is I'm still trying to get it to build and I don't know yet how it will work out.
I haven't written any code yet since I don't fully understand what it is I need to build. I have a feeling the biggest hurdle will be figuring out how to work with dwarf due to its complexity. Googling for information on how dwarf works under windows hasn't turned up anything helpful either. For example, there's no information about the 'glue' that's needed to contain dwarf within a PE executable image file. How exactly are the dwarf sections laid out? Is there any header information for each section? GDB clearly doesn't just take a 'raw' dwarf debug file and use it as is. So what kind of format does gdb expect the debug file to be in for it to be able to work with it?
My question is, how can I start on such a project? More importantly, where can I turn to for help when I inevitably get stuck on a problem?
Comments (2)
Affinic Assembler for Windows
Affinic Assembler is an x86/x86-64 assembler for Windows that takes GAS-syntax assembly source with DWARF debug information and generates the corresponding CodeView format sections in the object file, so that the linked program is debuggable in Visual Studio. This program is good for Cygwin and MinGW users porting Linux code to Windows.
http://www.affinic.com/?page_id=48
You are asking several questions here :-)
I think you are heading in the right direction, using libdwarf.
BUT, have you taken a look at objcopy to see if this tool can do some of the work for you? It probably doesn't support borland, pdb or codeview4, but it might be worth looking into. (Another approach may be to extend objcopy to support the formats you are trying to convert between.)
I have used the dwarf-discuss mailing list sometimes when I have become stuck.
http://lists.dwarfstd.org/listinfo.cgi/dwarf-discuss-dwarfstd.org
As for the questions on dwarf, split them into separate questions and I will do my best to answer them. :-)
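On the narrower question of how the dwarf sections sit inside a PE file, one way to get a feel for it is to list the section table of a MinGW-built executable and look for the `.debug_*` entries. Below is a minimal sketch of that, not a definitive implementation. It assumes the input is a PE image and that the linker used the GNU convention for long section names: the 8-byte name field holds `/N`, where N is a decimal offset into the COFF string table that follows the symbol table. The function name `list_pe_sections` is made up for this illustration.

```python
import struct


def list_pe_sections(data: bytes):
    """Return (name, file_offset, raw_size) for each section of a PE image.

    Assumption: long section names such as ".debug_info" are stored
    GNU-style as "/<decimal offset>" pointing into the COFF string table.
    """
    # The DOS header stores the offset of the PE signature at 0x3C.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("not a PE image")

    # The 20-byte COFF file header follows the 4-byte signature.
    coff = e_lfanew + 4
    (_machine, nsections, _timestamp, symtab_off, nsyms,
     opt_size, _characteristics) = struct.unpack_from("<HHIIIHH", data, coff)

    # The string table starts right after the symbol table
    # (each symbol record is 18 bytes). No symbol table, no string table.
    strtab = symtab_off + nsyms * 18 if symtab_off else None

    # Section headers (40 bytes each) follow the optional header.
    table = coff + 20 + opt_size
    sections = []
    for i in range(nsections):
        raw_name, _vsize, _vaddr, raw_size, raw_off = struct.unpack_from(
            "<8sIIII", data, table + i * 40)
        name = raw_name.rstrip(b"\0").decode("ascii", "replace")
        # Resolve "/N" through the string table when one is present.
        if name.startswith("/") and name[1:].isdigit() and strtab is not None:
            start = strtab + int(name[1:])
            end = data.index(b"\0", start)
            name = data[start:end].decode("ascii", "replace")
        sections.append((name, raw_off, raw_size))
    return sections
```

Reading an executable with `open(path, "rb").read()` and passing the bytes to this function gives the file offset and size of each section. The raw bytes at those offsets are what a dwarf reader would then have to interpret.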