帮助对语法进行左分解以删除左递归
我有一个小型的自定义脚本语言,我正在尝试更新它以允许布尔表达式,例如 a > 2
和 a> 2和(b<3或c>5)
。我在这里遇到麻烦的是括号表达式。
这是(根据 @Bart Kiers 的答案对原始帖子进行编辑的)完整语法,展示了该问题。这是我实际语法的精简版本,但问题也出现在这里。
grammar test;
options {
language = 'JavaScript';
output = AST;
}
statement
: value_assignment_statement
EOF
;
value_assignment_statement
: IDENT
'='
expression
;
value_expression
: value_list_expression
| IDENT
;
value_list_expression
: value_enumerated_list
;
value_enumerated_list : '{' unary+ '}'
;
term
: LPAREN expression RPAREN
| INTEGER
| value_expression
;
unary : ( '+' | '-' )* term
;
mult : unary ( ('*' | '/') unary)*
;
expression : mult ( ('+' | '-') mult )*
;
boolean
: boolean_expression
EOF
;
boolean_expression
: boolean_or_expression
;
boolean_or_expression
: boolean_and_expression (OR boolean_and_expression)*
;
boolean_and_expression
: boolean_rel_expression (AND boolean_rel_expression)*
;
boolean_rel_expression
: boolean_neg_expression relational_operator boolean_neg_expression
;
boolean_neg_expression
: (NOT)? atom
;
atom
: LPAREN boolean_expression RPAREN
//| expression
;
relational_operator : '=' | '>' | '<';
LPAREN : '(';
RPAREN : ')';
AND : 'and';
OR : 'or';
NOT : 'not';
IDENT : LETTER LETTER+;
INTEGER : DIGIT+;
WS : (' ' | '\n' | '\r' | '\t')+ { $channel = HIDDEN; };
fragment DIGIT : '0'..'9';
fragment LETTER : ('a'..'z' | 'A'..'Z');
我尝试容纳括号布尔表达式,例如 a > 2 或 (b < 3)
位于 atom
规则中的注释行中。当我取消注释这一行并将其包含在语法中时,ANTLR 给出以下错误:
[致命]规则原子由于可从 alts 1,2 到达的递归规则调用而具有非 LL(*) 决策。通过左因子分解或使用语法谓词或使用 backtrack=true 选项进行解析。
我想通过删除递归来解决这个问题,但我似乎无法从 Wikipedia 描述进行转换关于如何删除我自己的东西的左递归。
在使用此语法时,我有时希望使用 statement
作为输入的根,例如 abc = 2 + 3
,它将一个值分配给名为 abc 的变量。其他时候,我想使用语法来计算以 boolean
作为根的表达式,输入如 abc > > 3且(xyz<5或xyz>10)
。当我尝试使用 @Bart 的答案作为模型时,它工作得很好,直到我尝试将 statement
使用的语法部分与 boolean
使用的部分合并。他们都应该能够使用表达式
,但这就是我遇到左递归错误的地方。
那么,如何既处理括号又避免左递归问题呢?
I have a small custom scripting language, and I am trying to update it to allow boolean expressions such as a > 2
and a > 2 and (b < 3 or c > 5)
. It's the parenthetical expressions that I am having trouble with here.
Here is a (edited since the original post based on the answer from @Bart Kiers) full grammar that exhibits the problem. This is a pared-down version of my actual grammar, but the problem occurs here too.
grammar test;
options {
language = 'JavaScript';
output = AST;
}
statement
: value_assignment_statement
EOF
;
value_assignment_statement
: IDENT
'='
expression
;
value_expression
: value_list_expression
| IDENT
;
value_list_expression
: value_enumerated_list
;
value_enumerated_list : '{' unary+ '}'
;
term
: LPAREN expression RPAREN
| INTEGER
| value_expression
;
unary : ( '+' | '-' )* term
;
mult : unary ( ('*' | '/') unary)*
;
expression : mult ( ('+' | '-') mult )*
;
boolean
: boolean_expression
EOF
;
boolean_expression
: boolean_or_expression
;
boolean_or_expression
: boolean_and_expression (OR boolean_and_expression)*
;
boolean_and_expression
: boolean_rel_expression (AND boolean_rel_expression)*
;
boolean_rel_expression
: boolean_neg_expression relational_operator boolean_neg_expression
;
boolean_neg_expression
: (NOT)? atom
;
atom
: LPAREN boolean_expression RPAREN
//| expression
;
relational_operator : '=' | '>' | '<';
LPAREN : '(';
RPAREN : ')';
AND : 'and';
OR : 'or';
NOT : 'not';
IDENT : LETTER LETTER+;
INTEGER : DIGIT+;
WS : (' ' | '\n' | '\r' | '\t')+ { $channel = HIDDEN; };
fragment DIGIT : '0'..'9';
fragment LETTER : ('a'..'z' | 'A'..'Z');
My attempt to accommodate parenthetical boolean expressions such as a > 2 or (b < 3)
is in the commented-out line in the atom
rule. When I uncomment this line and include it in the grammar, ANTLR gives me this error:
[fatal] rule atom has non-LL(*) decision due to recursive rule invocations reachable from alts 1,2. Resolve by left-factoring or using syntactic predicates or using backtrack=true option.
I would like to address this by removing the recursion, but I can't seem to make the transition from the Wikipedia description on how to remove left recursion to my own stuff.
In using this grammar, I want sometimes to use statement
as a root with input such as abc = 2 + 3
, which assigns a value to a variable named abc. Other times I want to use the grammar to evaluate an expression with boolean
as the root with input such as abc > 3 and (xyz < 5 or xyz > 10)
. When I tried to use @Bart's answer as a model, it worked fine until I tried to merge the parts of the grammar used by statement
with the parts used by boolean
. They should both be able to use an expression
, but that's where I'm stuck with this left recursion error.
So, how can I both handle parentheses and avoid the left recursion problem?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
布尔表达式与加法和乘法表达式相同,因此不应将它们分开。以下是如何考虑所有类型的表达式:
它将把示例输入:解析
到以下解析树中:
Boolean expressions are just the same as the additive- and multiplicative expression, and should therefore not be separated from them. Here's how to account for all types of expressions:
which will parse the example input:
into the following parse tree: