Parsing multiple C source files
I have multiple C source files and their respective header files. I am trying to parse these files using a parser generator, e.g. ANTLR.
In an ANTLR parser grammar, you can declare your header files using
@parser::includes
{ #include "a.h" }
You can start parsing the first file with, e.g.,
CommonTree tree = Parser.start("a.c");
and the parser will parse the header file
a.h
But how do you parse the files if you have multiple source files, e.g. b.c, c.c and so on, each with its own header file?
Comments (1)
C is a pig to parse: the semantic type of a token depends on what it has been declared as. Consider:
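A minimal sketch of the problem, written in Python for brevity. It takes the C statement T(x); as its example, where T and x are placeholder names assumed here rather than taken from the post. The classify helper and its typedef_names argument are hypothetical; they handle only this one statement shape and are nowhere near a real C parser.

```python
def classify(statement, typedef_names):
    """Classify a C statement of the exact shape NAME(ARG); (illustrative only).

    typedef_names stands in for the type environment a real C parser
    would have to build up while reading earlier declarations.
    """
    text = statement.strip()
    if "(" not in text or not text.endswith(");"):
        raise ValueError("only statements shaped like NAME(ARG); are handled")
    name = text[: text.index("(")].strip()
    # The same tokens mean different things depending on what NAME was declared as.
    if name in typedef_names:
        return "declaration"  # e.g. after `typedef int T;`, T(x); declares x
    return "call"             # otherwise T(x); calls a function named T


print(classify("T(x);", {"T"}))   # T was introduced by a typedef
print(classify("T(x);", set()))   # T is an ordinary function name
```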
If T is a type name, then this is a variable declaration. If it's the name of a function, it's a function call. In order to resolve this, any C parser that expects to actually work is going to have to keep a full type environment, which means it needs to be an unpleasantly large chunk of a C compiler.
There are ANTLR parsers for C that get all this stuff right, but they're not trivial to use and I don't have any experience with them, so I can't comment there.
Instead, you might want to look at using external tools to parse your C into something that's easier to deal with. gcc-xml is one such tool; it uses gcc itself to parse source files and then spits out XML that's much easier to handle.
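For several source files, one way to apply this is to run the tool once per .c file and collect one XML file per translation unit. The sketch below is in Python and rests on assumptions: that the executable is called gccxml, that it accepts an -fxml=OUTPUT option, and that it is installed on your PATH. Check your installed version before relying on any of this. The build_commands and run_all helpers are hypothetical names invented for this example.

```python
import shutil
import subprocess
from pathlib import Path


def build_commands(sources, tool="gccxml"):
    """Return one command line (argv list) per C source file.

    Assumes the tool takes the input file plus -fxml=<output>; adjust
    this if your installation uses different options.
    """
    commands = []
    for src in sources:
        out = Path(src).with_suffix(".xml")  # a.c -> a.xml
        commands.append([tool, str(src), f"-fxml={out}"])
    return commands


def run_all(sources, tool="gccxml"):
    """Run the tool on every source file, stopping at the first failure."""
    if shutil.which(tool) is None:
        raise FileNotFoundError(f"{tool} was not found on PATH")
    for cmd in build_commands(sources, tool):
        subprocess.run(cmd, check=True)


# Show the commands that would be run, without executing them.
for cmd in build_commands(["a.c", "b.c", "c.c"]):
    print(" ".join(cmd))
```

Each source file is handled on its own, so the headers it includes are picked up by the tool's own preprocessing step and never need to be listed separately.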