ANTLR AST rules fail with RewriteEmptyStreamException

Posted on 2024-08-30 02:11:48

I have a simple grammar:

grammar sample;
options { output = AST; }
assignment
    : IDENT ':=' expr ';'
    ;
expr    
    : factor ('*' factor)*
    ;
factor
    : primary ('+' primary)*
    ;
primary
    : NUM
    | '(' expr ')'
    ;
IDENT : ('a'..'z')+ ;
NUM   : ('0'..'9')+ ;
WS    : (' '|'\n'|'\t'|'\r')+ {$channel=HIDDEN;} ;

Now I want to add some rewrite rules to generate an AST. From what I've read online and in the Language Patterns book, I should be able to modify the grammar like this:

assignment
    : IDENT ':=' expr ';'   -> ^(':=' IDENT expr)
    ;
expr    
    : factor ('*' factor)* -> ^('*' factor+)
    ;
factor  
    : primary ('+' primary)* -> ^('+' primary+)
    ;
primary
    : NUM
    | '(' expr ')' -> ^(expr)
    ;

But it does not work. Although it compiles fine, when I run the parser I get a RewriteEmptyStreamException error. Here's where things get weird.

If I define the pseudo tokens ADD and MULT and use them instead of the tree node literals, it works without error.

tokens { ADD; MULT; }

expr    
    : factor ('*' factor)* -> ^(MULT factor+)
    ;
factor  
    : primary ('+' primary)* -> ^(ADD primary+)
    ;

Alternatively, if I use the node suffix notation, it also appears to work fine:

expr    
    : factor ('*'^ factor)*
    ;
factor  
    : primary ('+'^ primary)*
    ;

Is this discrepancy in behavior a bug?

Comments (2)

北城半夏 2024-09-06 02:11:48

No, not a bug, AFAIK. Take your expr rule for example:

expr    
    : factor ('*' factor)* -> ^('*' factor+)
    ;

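Since the '*' token might not be present (the input may contain only a single factor), the rewrite rule cannot use it unconditionally as the tree root: with no '*' matched, there is nothing to build the root from. So the above is incorrect, and ANTLR is right to complain about it.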

Now if you insert an imaginary token like MULT instead:

expr    
    : factor ('*' factor)* -> ^(MULT factor+)
    ;

all is okay: an imaginary token is created in the rewrite itself rather than taken from the input, and your rule will always produce one or more factors.
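To make the difference concrete, here is a toy Python model of the two rewrites. It is only a sketch of the idea, not ANTLR's runtime: `ElementStream`, `RewriteEmptyStream` and `rewrite_expr` are made-up names for illustration, and the "parser" just sorts a flat token list into two streams.

```python
class RewriteEmptyStream(Exception):
    """Stand-in for ANTLR's RewriteEmptyStreamException (toy model only)."""


class ElementStream:
    """Hypothetical model of the per-element stream a rewrite reads from."""

    def __init__(self, name):
        self.name = name
        self.items = []

    def add(self, item):
        self.items.append(item)

    def first(self):
        # Referencing an element that was never matched fails here.
        if not self.items:
            raise RewriteEmptyStream(self.name)
        return self.items[0]


def rewrite_expr(tokens, root="'*'"):
    """Model `factor ('*' factor)* -> ^(root factor+)` over a flat token list."""
    stars, factors = ElementStream("'*'"), ElementStream("factor")
    for tok in tokens:
        (stars if tok == "*" else factors).add(tok)
    # A literal root must come from a matched token;
    # an imaginary root (e.g. "MULT") is simply created fresh.
    root_node = stars.first() if root == "'*'" else root
    return (root_node, *factors.items)
```

In this model, `rewrite_expr(["2", "*", "3"])` builds a tree rooted at the matched `*`, `rewrite_expr(["2"], root="MULT")` works because the root does not depend on the input, and `rewrite_expr(["2"])` raises because no `*` was ever added to its stream.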

What you probably meant to do is something like this:

expr    
    :  (factor -> factor) ('*' f=factor -> ^('*' $expr $f))*
    ;

Also see chapter 7: Tree Construction from The Definitive ANTLR Reference. Especially the paragraphs Rewrite Rules in Subrules (page 173) and Referencing Previous Rule ASTs in Rewrite Rules (page 174/175).
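The way `$expr` accumulates the tree in the rule above can be pictured as a left fold. The following Python sketch only models the resulting tree shape with tuples (root first, then children); `build_nested` is a hypothetical helper, not something ANTLR generates.

```python
def build_nested(factors, op="*"):
    """Model `(factor -> factor) ('*' f=factor -> ^('*' $expr $f))*`."""
    tree = factors[0]         # (factor -> factor): start with the first operand
    for f in factors[1:]:     # one pass per ('*' f=factor) iteration
        tree = (op, tree, f)  # ^('*' $expr $f): the tree so far becomes the first child
    return tree
```

With a single factor the result is just that factor, so no '*' root is ever needed; with `["a", "b", "c"]` the model yields `("*", ("*", "a", "b"), "c")`.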

蓝天 2024-09-06 02:11:48

If you want to generate an N-ary tree for the '*' operator with all children at the same level you can do this:

expr
    : (s=factor -> factor) (('*' factor)+ -> ^('*' $s factor+))?
    ;

Here are some examples of what this will return:

Tokens: AST
factor: factor
factor '*' factor: ^('*' factor factor)
factor '*' factor '*' factor: ^('*' factor factor factor)

The third example in the answer above will produce a nested tree, since $expr in each successive iteration refers to the tree built so far, which becomes the first child of the new root, like this:

factor * factor * factor: ^('*' ^('*' factor factor) factor)

which you probably don't need since multiplication is associative.
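As a rough illustration of why the flat shape is enough here, this Python sketch models the intended N-ary result from the table above and evaluates it next to a nested tree. `build_flat` and `evaluate` are hypothetical helpers, trees are plain tuples (root first, then children), and numbers stand in for factors.

```python
from functools import reduce
from operator import mul


def build_flat(factors, op="*"):
    """Model the intended result of `(s=factor -> factor) (('*' factor)+ -> ^('*' $s factor+))?`."""
    if len(factors) == 1:
        return factors[0]  # optional part not taken: the tree is the lone factor
    return (op, *factors)  # one root with every factor as a direct child


def evaluate(tree):
    """Multiply out a tree; leaves are numbers, inner nodes are ('*', child, ...)."""
    if not isinstance(tree, tuple):
        return tree
    _, *children = tree
    return reduce(mul, (evaluate(c) for c in children))
```

Here `build_flat([2, 3, 4])` gives `("*", 2, 3, 4)`, and evaluating it produces the same value as evaluating the nested `("*", ("*", 2, 3), 4)`, which is the point of flattening an operator whose grouping does not matter.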
