libclang parsing produces incorrect output

Posted on 2024-10-27 19:25:37

I am trying to build a small parsing program using libclang.

The source file to parse (Node.h):

#pragma once

struct Node {
    int value;
    struct Node *next;
};

The main program simply invokes the clang parser and walks all elements in the AST:

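#include <stdio.h>
#include <clang-c/Index.h>

/* The visitor is defined further down. */
enum CXChildVisitResult printVisitor(CXCursor cursor, CXCursor parent, CXClientData client_data);

int main(int argc, char *argv[]) {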
    CXIndex index = clang_createIndex(0, 0);

    const char *filename = "Node.h";

    CXTranslationUnit TU = clang_parseTranslationUnit(index, filename, NULL, 0, NULL, 0, CXTranslationUnit_None);

    CXCursor rootCursor = clang_getTranslationUnitCursor(TU);

    clang_visitChildren(rootCursor, printVisitor, NULL);

    clang_disposeTranslationUnit(TU);
    clang_disposeIndex(index);
    return 0;
}

The visitor:

enum CXChildVisitResult printVisitor(CXCursor cursor, CXCursor parent, CXClientData client_data) {
    /* Source range covered by the current cursor. */
    CXSourceRange range = clang_getCursorExtent(cursor);
    CXSourceLocation startLocation = clang_getRangeStart(range);
    CXSourceLocation endLocation = clang_getRangeEnd(range);

    CXFile file;
    unsigned int line, column, offset;
    /* clang_getExpansionLocation replaces the deprecated clang_getInstantiationLocation. */
    clang_getExpansionLocation(startLocation, &file, &line, &column, &offset);
    printf("Start: Line: %u Column: %u Offset: %u\n", line, column, offset);
    clang_getExpansionLocation(endLocation, &file, &line, &column, &offset);
    printf("End: Line: %u Column: %u Offset: %u\n", line, column, offset);

    return CXChildVisit_Recurse;
}

However, the output contains some unexpected entries:

Start: Line: 99 Column: 9 Offset: 3160 
End: Line: 99 Column: 122 Offset: 3273 
Kind: A field (in C) or non-static data member (in C++) in a struct.
Filename: (null)

Where does this come from?

Removing the pragma changes nothing, and the same output appears when parsing a completely empty header file.

Do I have to skip every node found in the AST until I reach a "first statement" or "first expression" node?


Comments (1)

半夏半凉 2024-11-03 19:25:37

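I created the translation unit like this:

/* source_filename is 0, so the file to parse is taken from the command-line arguments. */
CXTranslationUnit TU = clang_parseTranslationUnit(Index, 0, argv, argc,
                                                  0, 0,
                                                  CXTranslationUnit_None);

Then I ran your test case (Node.h) and got this result: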

Start: Line: 3 Column: 1 Offset: 14
End: Line: 6 Column: 2 Offset: 67
Start: Line: 4 Column: 5 Offset: 32
End: Line: 4 Column: 14 Offset: 41
Start: Line: 5 Column: 5 Offset: 47
End: Line: 5 Column: 22 Offset: 64
Start: Line: 5 Column: 12 Offset: 54
End: Line: 5 Column: 16 Offset: 58

I think this result is correct. Try creating the translation unit this way.
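One way to check that these numbers really describe Node.h is to map the reported byte offsets back onto the file text. The sketch below does not use libclang at all; it only assumes that Node.h is exactly as posted in the question, uses "\n" line endings, and that offsets are 0-based while lines and columns are 1-based. The names `NODE_H`, `offset_to_position` and `copy_range` are hypothetical helpers made up for this illustration.

```c
#include <stddef.h>
#include <string.h>

/* Node.h as posted in the question (assumption: "\n" line endings). */
const char NODE_H[] =
    "#pragma once\n"
    "\n"
    "struct Node {\n"
    "    int value;\n"
    "    struct Node *next;\n"
    "};\n";

/* Convert a 0-based byte offset into a 1-based line and column.
 * Returns 0 on success, -1 if the offset lies past the end of the text. */
int offset_to_position(const char *text, size_t offset,
                       unsigned *line, unsigned *column)
{
    unsigned l = 1, c = 1;
    for (size_t i = 0; i < offset; ++i) {
        if (text[i] == '\0')
            return -1;          /* offset is beyond the text */
        if (text[i] == '\n') {  /* a newline starts the next line */
            ++l;
            c = 1;
        } else {
            ++c;
        }
    }
    *line = l;
    *column = c;
    return 0;
}

/* Copy the bytes in the half-open range [start, end) into buf as a
 * NUL-terminated string. Returns 0 on success, -1 on a bad range or
 * a buffer that is too small. */
int copy_range(const char *text, size_t start, size_t end,
               char *buf, size_t buf_size)
{
    size_t len = strlen(text);
    if (start > end || end > len || end - start >= buf_size)
        return -1;
    memcpy(buf, text + start, end - start);
    buf[end - start] = '\0';
    return 0;
}
```

With these helpers, offset 14 maps to line 3, column 1 and offset 67 to line 6, column 2, matching the first Start/End pair. The range 32 to 41 covers `int value`, 47 to 64 covers `struct Node *next`, and 54 to 58 covers `Node`, so every extent in the output above falls inside the struct declaration in Node.h.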
