libclang parsing produces wrong output
I am trying to build a small parsing program using libclang.
The source file to parse (Node.h):
#pragma once

struct Node {
    int value;
    struct Node *next;
};
The main program simply invokes the clang parser and walks all elements in the AST:
int main(int argc, char *argv[]) {
    CXIndex index = clang_createIndex(0, 0);
    const char *filename = "Node.h";
    CXTranslationUnit TU = clang_parseTranslationUnit(index, filename, NULL, 0, NULL, 0, CXTranslationUnit_None);
    CXCursor rootCursor = clang_getTranslationUnitCursor(TU);
    clang_visitChildren(rootCursor, printVisitor, NULL);
    clang_disposeTranslationUnit(TU);
    clang_disposeIndex(index);
    return 0;
}
The visitor:
CXChildVisitResult printVisitor(CXCursor cursor, CXCursor parent, CXClientData client_data) {
    CXSourceRange range = clang_getCursorExtent(cursor);
    CXSourceLocation startLocation = clang_getRangeStart(range);
    CXSourceLocation endLocation = clang_getRangeEnd(range);
    CXFile file;
    unsigned int line, column, offset;
    clang_getInstantiationLocation(startLocation, &file, &line, &column, &offset);
    printf("Start: Line: %u Column: %u Offset: %u\n", line, column, offset);
    clang_getInstantiationLocation(endLocation, &file, &line, &column, &offset);
    printf("End: Line: %u Column: %u Offset: %u\n", line, column, offset);
    return CXChildVisit_Recurse;
}
However, the output shows some weird parts:
Start: Line: 99 Column: 9 Offset: 3160
End: Line: 99 Column: 122 Offset: 3273
Kind: A field (in C) or non-static data member (in C++) in a struct.
Filename: (null)
Where does this come from?
Removing the pragma changes nothing. The same happens when parsing a completely empty header file.
Do I have to skip every node found in the AST until I reach a "first statement" or "first expression" node?
Comments (1)
I created the TU like this:
CXTranslationUnit TU = clang_parseTranslationUnit(Index, 0, argv, argc,
                                                  0, 0,
                                                  CXTranslationUnit_None);
Then I ran your test case (Node.h) and got this result:
Start: Line: 3 Column: 1 Offset: 14
End: Line: 6 Column: 2 Offset: 67
Start: Line: 4 Column: 5 Offset: 32
End: Line: 4 Column: 14 Offset: 41
Start: Line: 5 Column: 5 Offset: 47
End: Line: 5 Column: 22 Offset: 64
Start: Line: 5 Column: 12 Offset: 54
End: Line: 5 Column: 16 Offset: 58
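These numbers can be checked by hand against the header. The sketch below is not part of the original answer: it assumes Node.h has a blank line after the pragma and four-space indentation (the layout the reported offsets imply), and `NODE_H` and `offset_of` are hypothetical names introduced only for this check. It converts a 1-based line/column pair into a 0-based byte offset, which is how the three values in each output line appear to relate.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed contents of Node.h: blank line after the pragma,
   four-space indentation, Unix line endings. */
const char NODE_H[] =
    "#pragma once\n"
    "\n"
    "struct Node {\n"
    "    int value;\n"
    "    struct Node *next;\n"
    "};\n";

/* Convert a 1-based (line, column) pair into a 0-based byte offset.
   Returns -1 if the text has fewer lines than requested.
   The column is not checked against the length of the line. */
long offset_of(const char *text, unsigned line, unsigned column) {
    assert(text != NULL && line >= 1 && column >= 1);
    const char *p = text;
    for (unsigned current = 1; current < line; ++current) {
        while (*p != '\0' && *p != '\n') {
            ++p;                    /* skip to the end of this line */
        }
        if (*p == '\0') {
            return -1;              /* ran out of lines */
        }
        ++p;                        /* step past the newline */
    }
    return (long)(p - text) + (long)(column - 1);
}
```

Under that assumed layout, `offset_of(NODE_H, 3, 1)` gives 14 and `offset_of(NODE_H, 6, 2)` gives 67, matching the first Start/End pair, which covers the whole struct declaration. The last pair (line 5, columns 12 to 16) covers the four characters of `Node` in `struct Node *next`.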
I think this result is correct. You could try it this way.
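On the question of skipping unwanted nodes: the "Filename: (null)" line in the question's output suggests that cursor does not map to any file on disk. One possible filter, sketched here as an assumption and not as something the original posts confirm, is to report a cursor only when its file name matches the file being parsed. `should_report` is a hypothetical helper; obtaining the file name from libclang inside the visitor is left out so that the snippet compiles without libclang.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: decide whether a cursor should be printed.
   file_name is the name of the file the cursor's location maps to,
   or NULL when the location is not in any file.
   main_file is the name of the file being parsed, e.g. "Node.h". */
int should_report(const char *file_name, const char *main_file) {
    assert(main_file != NULL);
    if (file_name == NULL) {
        return 0;                               /* no file behind this cursor: skip it */
    }
    return strcmp(file_name, main_file) == 0;   /* keep only cursors from the main file */
}
```

A visitor could call such a helper before printing and return CXChildVisit_Continue for cursors it rejects. Comparing names as plain strings assumes both are spelled the same way (for example, both relative). libclang also provides clang_Location_isFromMainFile, which may serve the same purpose without any string comparison.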