AST with repaired nodes instead of error nodes in ANTLR

Posted on 2024-09-01 03:12:17

I have an ANTLR-generated parser for Java that uses the C target, and it works quite well. The problem is that I also want it to parse erroneous code and still produce a meaningful AST. If I feed it a minimal Java class with a single import that is missing its trailing semicolon, it produces two "Tree Error Node" objects where the "import" token and the tokens for the imported class should be.

But since it parses the code that follows correctly and produces the correct nodes for it, it must recover from the error, either by inserting the semicolon or by resyncing. Is there a way to make ANTLR reflect this internally repaired input in the AST? Or can I at least get the tokens/text that produced the "Tree Error Node" objects somehow?

In the C target's antlr3commontreeadaptor.c, around line 200, the following fragment indicates that the C target only creates dummy error nodes so far:

static  pANTLR3_BASE_TREE
errorNode                               (pANTLR3_BASE_TREE_ADAPTOR adaptor,   pANTLR3_TOKEN_STREAM ctnstream, pANTLR3_COMMON_TOKEN startToken, pANTLR3_COMMON_TOKEN stopToken, pANTLR3_EXCEPTION e)
{
    // Use the supplied common tree node stream to get another tree from the factory
    // TODO: Look at creating the erronode as in Java, but this is complicated by the
    // need to track and free the memory allocated to it, so for now, we just
    // want something in the tree that isn't a NULL pointer.
    //
    return adaptor->createTypeText(adaptor, ANTLR3_TOKEN_INVALID, (pANTLR3_UINT8)"Tree Error Node");
}
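To make concrete what I would like the error node to carry instead of the fixed "Tree Error Node" string, here is a minimal sketch that joins the text of the tokens between startToken and stopToken. It deliberately does not use the ANTLR3 C runtime: SketchToken and errorNodeText are hypothetical stand-ins invented for this illustration, and wiring the idea into the real errorNode would still have to deal with the memory tracking that the TODO comment mentions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a token; NOT the ANTLR3 C runtime's token type. */
typedef struct {
    const char *text;   /* text of the token as it appeared in the input */
} SketchToken;

/*
 * Hypothetical helper: joins the text of tokens[start..stop] (inclusive),
 * separated by single spaces, into buf (capacity cap, including the NUL).
 * Only whole tokens are copied; tokens that no longer fit are dropped.
 * Returns the number of characters written, excluding the terminating NUL.
 */
static size_t
errorNodeText(const SketchToken *tokens, size_t count,
              size_t start, size_t stop, char *buf, size_t cap)
{
    size_t used = 0;

    assert(buf != NULL && cap > 0);
    buf[0] = '\0';

    /* Nothing to copy for an empty or inverted range. */
    if (tokens == NULL || start > stop || start >= count)
        return 0;
    if (stop >= count)
        stop = count - 1;   /* clamp to the last available token */

    for (size_t i = start; i <= stop; i++) {
        const char *t   = tokens[i].text ? tokens[i].text : "";
        size_t      len = strlen(t);
        size_t      sep = (i > start) ? 1 : 0;

        /* Leave room for the separator, the token, and the NUL. */
        if (used + sep + len >= cap)
            break;
        if (sep)
            buf[used++] = ' ';
        memcpy(buf + used, t, len);
        used += len;
        buf[used] = '\0';
    }
    return used;
}
```

For the broken import from my example, calling this over the tokens import, java, ., util, ., List would give the error node the text "import java . util . List", which is the kind of information I am hoping to get out of the tree.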

Am I out of luck here, and is it only the error nodes produced by the Java target that would let me retrieve the text of the erroneous nodes?
