使用 ANTLR 构建树
删除左递归
E -> E + T|T T -> T * F|F F -> INT | ( E )
正如 在 ANTLR 中删除左递归中询问和回答的那样,我可以在之后 左递归去除,我得到以下一个
E -> TE' E' -> null | + TE' T -> FT' T' -> null | * FT'
那么,如何用修改后的语法进行树构造呢? 输入1+2,我想要一棵树
^('+' ^(INT 1) ^(INT 2)). Or similar.
grammar T; options { output=AST; language=Python; ASTLabelType=CommonTree; } start : e -> e ; e : t ep -> ??? ; ep : | '+' t ep -> ??? ; t : f tp -> ??? ; tp : | '*' f tp -> ??? ; f : INT | '(' e ')' -> e ; INT : '0'..'9'+ ; WS: (' '|'\n'|'\r')+ {$channel=HIDDEN;} ;
As asked and answered in Removing Left Recursion in ANTLR , I could remove the left recursion
E -> E + T|T T -> T * F|F F -> INT | ( E )
After left recursion removal, I get the following one
E -> TE' E' -> null | + TE' T -> FT' T' -> null | * FT'
Then, how to make the tree construction with the modified grammar?
With the input 1+2, I want to have a tree
^('+' ^(INT 1) ^(INT 2))
. Or similar.
grammar T; options { output=AST; language=Python; ASTLabelType=CommonTree; } start : e -> e ; e : t ep -> ??? ; ep : | '+' t ep -> ??? ; t : f tp -> ??? ; tp : | '*' f tp -> ??? ; f : INT | '(' e ')' -> e ; INT : '0'..'9'+ ; WS: (' '|'\n'|'\r')+ {$channel=HIDDEN;} ;
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
一点意见:虽然有时可以从 LR 语法转到 LL 语法,正如您所做的那样,但结果并不那么惯用,并且对于熟悉 LL 语法的人来说,定义语法的方式可能看起来很奇怪。
例如,考虑上面的以下摘录:
上面接受
*
后跟f
,其 FIRST 集将包含INT
或(
,其自身的开头是其右递归。因此,您永远不会看到您想要以*
为根的表达式的开头,这将使它比它需要的困难得多为了能够轻松地在 ANTLR 中创建 AST,您需要同时拥有操作数和运算符,
^
构成+ 树的根,两个 INT 成为其子节点。
示例 Bart K linked to 是一个很好的示例,说明了我希望如何使用 LL 语法完成它......并且它可以扩展以支持不同优先级的运算符。
A bit of opinion: although it's sometimes possible to go from an LR grammar to an LL grammar, as you have done, the result isn't as idiomatic and may seem like a strange way to define your grammar to someone familiar with LL grammars.
For example, consider the following excerpt from above:
The above accepts a
*
followed by anf
whose FIRST set will containINT
or(
, the start of itself as its right recursive. Thus, you'll never see the start of the expression you want rooted at*
, which will make it much more difficult than it needs to be to build the tree you want.To make it easy to create that AST in ANTLR, you want to have both the operands and the operator.
The caret,
^
makes the+
the root of the tree and the twoINT
s become its children.The example Bart K linked to is a great example of how I'd expect to see it done with an LL grammar ... and it scales to support operators of different precedence.