Can I get an XML AST dump of C/C++ code with clang without invoking the compiler?
I successfully built clang for Windows with CMake and Visual Studio 10. I would like to get an XML file as an AST representation of the source code. One option produces this result with clang built with gcc under Linux (Ubuntu), but it does not work on the Windows box:
clang -cc1 -ast-print-xml source.c
However, this invokes the compilation stage, which I would like to avoid. Digging through the source code has not helped so far, as I am quite new to clang. I managed to generate the binary version of the AST with:
clang -emit-ast source.c
Unfortunately, this format cannot be parsed directly. Is there an existing way to make clang generate the XML tree directly instead of the binary one?
The goal is to use the XML representation in other tools in the .NET environment, so otherwise I would need to write a wrapper around the native clang library to access the binary AST. A third option might exist if someone has already written a parser for the binary clang AST for .NET.
Am I perhaps missing something, for example that the AST generated by the clang front end is not equivalent to the one generated in the compilation stage?
Comments (3)
For your information, the XML printer was removed in version 2.9 by Douglas Gregor (who is responsible for the Clang front end).
The issue was that the XML printer was incomplete. A number of AST nodes had never been implemented in the printer, nor had a number of the properties of some nodes, which led to an inaccurate representation of the source code.
Another point raised by Douglas was that the output should be suitable not for debugging Clang itself (which is what -emit-ast is about) but for consumption by external tools. This requires the output to be stable from one version to the next. Notably, it should not be a one-to-one mapping of Clang's internals, but rather a translation of the source code into a standardized language. Unless significant work goes into the printer (which requires volunteers), it will not be integrated back.
I've been working on my own tool for extracting XML from Clang's AST. My code uses the Python bindings of libclang to traverse the AST.
The code is available at https://github.com/BentleyJOakes/PCX
Edit: I should add that it is quite incomplete when it comes to producing the right source code tokens for each AST node. Unfortunately, this has to be coded separately for each AST node type. However, the code should give a basis for anyone who wants to pursue this further.
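To illustrate the general approach (this is not the PCX code itself), here is a minimal sketch of walking a libclang cursor tree from Python and emitting XML. It assumes the `clang.cindex` bindings and a matching libclang library are installed; the function names `cursor_to_xml` and `source_to_xml` and the attribute names `spelling` and `line` are my own choices for illustration, not a standard schema.

```python
import xml.etree.ElementTree as ET


def cursor_to_xml(cursor):
    """Recursively turn a libclang cursor into an XML element.

    Only the cursor kind, spelling and line number are recorded; a real
    tool would need per-node handling for types, tokens and so on.
    """
    elem = ET.Element(cursor.kind.name)  # e.g. FUNCTION_DECL, VAR_DECL
    if cursor.spelling:
        elem.set("spelling", cursor.spelling)
    if cursor.location.file is not None:
        elem.set("line", str(cursor.location.line))
    for child in cursor.get_children():
        elem.append(cursor_to_xml(child))
    return elem


def source_to_xml(path, args=()):
    """Parse a source file with libclang and return its AST as an XML string."""
    # Imported here so that cursor_to_xml can be used without libclang.
    from clang import cindex

    index = cindex.Index.create()
    tu = index.parse(path, args=list(args))
    return ET.tostring(cursor_to_xml(tu.cursor), encoding="unicode")
```

Calling `source_to_xml("source.c")` only parses the file; no object code is generated. The output mirrors libclang's cursor kinds, so it carries the same stability caveats raised in the other answer.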
Using a custom ASTDumper would do the job, without compiling any source file of course (clang stops after the front end). However, you would have to work with the C and C++ sources of LLVM to accomplish that.
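As an alternative to patching clang, stopping after the front end can also be approximated from the outside. The sketch below assumes a much newer clang than the versions discussed in this thread, one that accepts `-fsyntax-only` together with `-Xclang -ast-dump=json`; the helper names (`ast_dump_command`, `json_node_to_xml`, `dump_xml`) are hypothetical, and only the `kind`, `name` and `inner` fields of the JSON dump are used.

```python
import json
import subprocess
import xml.etree.ElementTree as ET


def ast_dump_command(source, clang="clang"):
    """Build a command line that parses `source` and dumps the AST as JSON.

    -fsyntax-only stops after the front end, so no code is generated.
    """
    return [clang, "-fsyntax-only", "-Xclang", "-ast-dump=json", source]


def json_node_to_xml(node):
    """Convert one node of the JSON dump into an XML element, recursively."""
    elem = ET.Element(node.get("kind", "Unknown"))
    if "name" in node:
        elem.set("name", node["name"])
    for child in node.get("inner", []):
        if child:  # skip empty placeholder entries
            elem.append(json_node_to_xml(child))
    return elem


def dump_xml(source, clang="clang"):
    """Run clang on `source` and return a simplified XML view of its AST."""
    result = subprocess.run(
        ast_dump_command(source, clang),
        check=True,
        capture_output=True,
        text=True,
    )
    return ET.tostring(json_node_to_xml(json.loads(result.stdout)), encoding="unicode")
```

The resulting XML is a thin re-encoding of clang's own dump format, so it is no more stable across versions than that dump is.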