ANTLR left recursion
I'm trying to convert the postfix, infix and prefix rules from Scala's EBNF grammar to ANTLR, but I'm seeing an error relating to left recursion on the infixExpression rule.
The rules in question are:
    public symbolOrID
        : ID
        | Symbol
        ;

    public postfixExpression
        : infixExpression symbolOrID? -> ^(R__PostfixExpression infixExpression symbolOrID?)
        ;

    public infixExpression
        : prefixExpression
        | infixExpression (symbolOrID infixExpression)? -> ^(R__InfixExpression infixExpression symbolOrID? infixExpression?)
        ;

    public prefixExpression
        : prefixCharacter? simpleExpression -> ^(R__PrefixExpression prefixCharacter? simpleExpression)
        ;

    public prefixCharacter
        : '-' | '+' | '~' | '!' | '#'
        ;

    public simpleExpression
        : constant
        ;
If I change the infixExpression rule to:
    public infixExpression
        : prefixExpression (symbolOrID infixExpression)? -> ^(R__InfixExpression prefixExpression symbolOrID? infixExpression?)
        ;
then ANTLR complains instead:
warning(200): Hydra.g3:108:26: Decision can match input such as "{ID, Symbol} {'!'..'#', '+', '-', '~'} String" using multiple alternatives: 1, 2
As a result, alternative(s) 2 were disabled for that input
warning(200): Hydra.g3:108:26: Decision can match input such as "{ID, Symbol} {'!'..'#', '+', '-', '~'} Number" using multiple alternatives: 1, 2
As a result, alternative(s) 2 were disabled for that input
warning(200): Hydra.g3:108:26: Decision can match input such as "{ID, Symbol} {'!'..'#', '+', '-', '~'} Boolean" using multiple alternatives: 1, 2
As a result, alternative(s) 2 were disabled for that input
warning(200): Hydra.g3:108:26: Decision can match input such as "{ID, Symbol} {'!'..'#', '+', '-', '~'} Regex" using multiple alternatives: 1, 2
As a result, alternative(s) 2 were disabled for that input
warning(200): Hydra.g3:108:26: Decision can match input such as "{ID, Symbol} {'!'..'#', '+', '-', '~'} Null" using multiple alternatives: 1, 2
As a result, alternative(s) 2 were disabled for that input
Lastly, is there a way to create AST nodes conditionally, so that no extra level is added when only the left part of a rule matches? E.g.:
    conditional_or_expression
        : conditional_and_expression ('||' conditional_or_expression)?
        ;
That is, let's say I create a grammar that follows a hierarchy like:
    conditional_and_expression
    conditional_or_expression
    null_coalescing_expression
If the expression being parsed is a || b, the AST currently created for it would be:
    conditional_and_expression
    conditional_or_expression
How could I get it so that it only contains the conditional_or_expression part?
In JavaCC, you could just set the node arity, e.g.: #ConditionalOrExpression(>1)
EDIT: it was a bit late last night; the infix expression is now properly modified!
Final edit: in the end, I got it to work with the following rules:
    public symbolOrID
        : ID
        | Symbol
        ;

    public postfixExpression
        : infixExpression (symbolOrID^)?
        ;

    public infixExpression
        : (prefixExpression symbolOrID)=> prefixExpression symbolOrID^ infixExpression
        | prefixExpression
        ;

    public prefixExpression
        : prefixCharacter^ simpleExpression
        | simpleExpression
        ;

    public prefixCharacter
        : '-' | '+' | '~' | '!' | '#'
        ;

    public simpleExpression
        : constant
        ;
Comments (2)
As I said in my comment: there's no left recursion in the rules you posted.
I'm assuming you're using ANTLRWorks' interpreter or debugger, in which case the tree is only being displayed like that (the parse tree is shown, not the AST). If you properly transform your orExpression into an AST, the expression a || b will become a tree with || as root, and a and b as child nodes.
For example, take the following grammar:
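As a rough illustration of the kind of grammar meant here, the following is a minimal ANTLR 3 sketch and not necessarily the grammar this answer originally showed. Only the rule names or_expr and and_expr come from the text; the grammar name Expr and the rules parse, add_expr, atom and Number are assumptions made for the example. It uses ANTLR 3's `^` (make root) and `!` (exclude from tree) operators with `output=AST`.

```antlr
grammar Expr;            // hypothetical grammar name

options {
  output=AST;            // ask ANTLR 3 to build an AST
}

parse
  : or_expr EOF!         // '!' keeps EOF out of the AST
  ;

or_expr
  : and_expr ('||'^ and_expr)*     // '^' makes '||' the root, only if it occurs
  ;

and_expr
  : add_expr ('&&'^ add_expr)*
  ;

add_expr                 // assumed rule, needed to parse input like 12+34
  : atom (('+' | '-')^ atom)*
  ;

atom
  : Number
  | '('! or_expr ')'!
  ;

Number
  : '0'..'9'+
  ;
```

Because `^` only creates a root node when the operator is actually matched, a rule whose loop is never entered simply passes its single operand's tree upward, so it adds no level of its own to the AST.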
If you now parse 12+34 with a parser generated from the grammar above, ANTLRWorks (or the Eclipse ANTLR IDE) will show a parse tree, but this is not the AST the parser creates: in the actual AST, the or_expr and and_expr "layers" are not there.
No problem, but you must realize that people can't answer your questions properly if you withhold crucial information. You don't need to post the entire grammar, but if you want help with the left recursion, you must post a (partial) grammar that actually causes the error(s) you mention. If I can't reproduce it, it doesn't exist! :)
This production:
can be rewritten as:
In fact, I bet this is just an error in the grammar. Let's show with an example that the rewrite is OK. Let's (partially) reduce something with the first grammar, and then try the second one.
Let's reduce it with the second grammar:
As you can see, you end up with equivalent ASTs in both cases.
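For reference, one common way to avoid direct left recursion in an ANTLR 3 rule like infixExpression is to replace the recursion with iteration. The sketch below is offered under that assumption and is not necessarily the rewrite this answer had in mind. It reuses the prefixExpression and symbolOrID rules from the question, and how it interacts with the optional trailing symbolOrID in postfixExpression would still need checking.

```antlr
// Hypothetical sketch: iteration instead of left recursion.
// prefixExpression and symbolOrID are the rules from the question.
infixExpression
    : prefixExpression (symbolOrID^ prefixExpression)*
    ;
```

With the `^` operator, each symbolOrID becomes the root of everything matched so far, so a op b op c would be grouped as ((a op b) op c). The right-recursive rule in the question's final edit groups the same input as (a op (b op c)) instead.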