使用 ANTLR 构建树

发布于 2024-09-05 12:06:30 字数 742 浏览 3 评论 0原文

删除左递归

E -> E + T|T
T -> T * F|F
F -> INT | ( E )

正如 在 ANTLR 中删除左递归中询问和回答的那样,我可以在之后 左递归去除,我得到以下一个

E -> TE'
E' -> null | + TE'
T -> FT'
T' -> null | * FT'

那么,如何用修改后的语法进行树构造呢? 输入1+2,我想要一棵树

^('+' ^(INT 1) ^(INT 2))
. Or similar.

grammar T;

options {
    output=AST;
    language=Python;
    ASTLabelType=CommonTree;
}

start : e -> e
   ;
e  : t ep -> ???
   ;
ep : 
   | '+' t ep -> ???
   ;
t : f tp -> ???
  ;
tp : 
  | '*' f tp -> ???
  ;
f : INT 
  | '(' e ')' -> e
  ;

INT :   '0'..'9'+ ;
WS: (' '|'\n'|'\r')+ {$channel=HIDDEN;} ;

As asked and answered in Removing Left Recursion in ANTLR , I could remove the left recursion

E -> E + T|T
T -> T * F|F
F -> INT | ( E )

After left recursion removal, I get the following one

E -> TE'
E' -> null | + TE'
T -> FT'
T' -> null | * FT'

Then, how to make the tree construction with the modified grammar?
With the input 1+2, I want to have a tree

^('+' ^(INT 1) ^(INT 2))

. Or similar.

grammar T;

options {
    output=AST;
    language=Python;
    ASTLabelType=CommonTree;
}

start : e -> e
   ;
e  : t ep -> ???
   ;
ep : 
   | '+' t ep -> ???
   ;
t : f tp -> ???
  ;
tp : 
  | '*' f tp -> ???
  ;
f : INT 
  | '(' e ')' -> e
  ;

INT :   '0'..'9'+ ;
WS: (' '|'\n'|'\r')+ {$channel=HIDDEN;} ;

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

谁人与我共长歌 2024-09-12 12:06:30

一点意见:虽然有时可以从 LR 语法转到 LL 语法,正如您所做的那样,但结果并不那么惯用,并且对于熟悉 LL 语法的人来说,定义语法的方式可能看起来很奇怪。

例如,考虑上面的以下摘录:

tp : 
  | '*' f tp -> ???

上面接受 * 后跟 f,其 FIRST 集将包含 INT(,其自身的开头是其右递归。因此,您永远不会看到您想要以 * 为根的表达式的开头,这将使它比它需要的困难得多为了能够

轻松地在 ANTLR 中创建 AST,您需要同时拥有操作数和运算符,

add:
   INT '+'^ INT;

^ 构成 + 树的根,两个 INT 成为其子节点。

示例 Bart K linked to 是一个很好的示例,说明了我希望如何使用 LL 语法完成它......并且它可以扩展以支持不同优先级的运算符。

A bit of opinion: although it's sometimes possible to go from an LR grammar to an LL grammar, as you have done, the result isn't as idiomatic and may seem like a strange way to define your grammar to someone familiar with LL grammars.

For example, consider the following excerpt from above:

tp : 
  | '*' f tp -> ???

The above accepts a * followed by an f whose FIRST set will contain INT or (, the start of itself as its right recursive. Thus, you'll never see the start of the expression you want rooted at *, which will make it much more difficult than it needs to be to build the tree you want.

To make it easy to create that AST in ANTLR, you want to have both the operands and the operator.

add:
   INT '+'^ INT;

The caret, ^ makes the + the root of the tree and the two INTs become its children.

The example Bart K linked to is a great example of how I'd expect to see it done with an LL grammar ... and it scales to support operators of different precedence.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文