ANTLR3 common values in 2 different domain values
I need to define a language parser for the following search criteria:
CRITERIA_1=<values-set-#1> AND/OR CRITERIA_2=<values-set-#2>;
where <values-set-#1> can have values from 1 to 50 and <values-set-#2> can be any value from the set (5, A, B, C); case is not important here.
I decided to use ANTLR3 (v3.4) with C# output (CSharp3), and it worked pretty smoothly until now. The problem is that it fails to parse the string when I provide a value that belongs to both value sets (i.e. '5' in this case). For example, if I provide the following string:
CRITERIA_1=5;
It returns the following error in place of the expected value node:
<unexpected: [@1,11:11='5',<27>,1:11], resync=5>
The grammar definition file is the following:
grammar ZeGrammar;
options {
language=CSharp3;
TokenLabelType=CommonToken;
output=AST;
ASTLabelType=CommonTree;
k=3;
}
tokens
{
ROOT;
CRITERIA_1;
CRITERIA_2;
OR = 'OR';
AND = 'AND';
EOF = ';';
LPAREN = '(';
RPAREN = ')';
}
public
start
: expr EOF -> ^(ROOT expr)
;
expr
: subexpr ((AND|OR)^ subexpr)*
;
subexpr
: grouppedsubexpr
| 'CRITERIA_1=' rangeval1_expr -> ^(CRITERIA_1 rangeval1_expr)
| 'CRITERIA_2=' rangeval2_expr -> ^(CRITERIA_2 rangeval2_expr)
;
grouppedsubexpr
: LPAREN! expr RPAREN!
;
rangeval1_expr
: rangeval1_subexpr
| RANGE1_VALUES
;
rangeval1_subexpr
: LPAREN! rangeval1_expr (OR^ rangeval1_expr)* RPAREN!
;
RANGE1_VALUES
: (('0'..'4')? ('0'..'9') | '5''0')
;
rangeval2_expr
: rangeval2_subexpr
| RANGE2_VALUES
;
rangeval2_subexpr
: LPAREN! rangeval2_expr (OR^ rangeval2_expr)* RPAREN!
;
RANGE2_VALUES
: '5' | ('a'|'A') | ('b'|'B') | ('c'|'C')
;
If I remove the value '5' from RANGE2_VALUES, it works fine. Can anyone give me a hint about what I am doing wrong?
You must realize that the lexer does not produce tokens based on what the parser is trying to match. So, in your case, the input "5" will always be tokenized as a RANGE1_VALUES and never as a RANGE2_VALUES, because both RANGE1_VALUES and RANGE2_VALUES can match this input but RANGE1_VALUES comes first (so RANGE1_VALUES takes precedence over RANGE2_VALUES).
A possible fix would be to remove both the RANGE1_VALUES and RANGE2_VALUES rules, replace them with lexer rules that cannot match the same input, introduce two new parser rules, range1_values and range2_values, built from those tokens, and change all RANGE1_VALUES and RANGE2_VALUES references in your parser rules to range1_values and range2_values respectively.
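As a rough sketch of that lexer-level fix, the rules could look something like the fragment below. The token names FIVE, OTHER_NUMBER and LETTER are hypothetical (only range1_values and range2_values come from the text above), and the digit ranges assume the 1-50 range stated in the question, so unlike the original RANGE1_VALUES this version does not accept 0 or leading zeros.

```antlr
// Sketch only: FIVE, OTHER_NUMBER and LETTER are hypothetical token names.

// Lexer rules: no two of these rules can match the same input.
FIVE
  : '5'
  ;
OTHER_NUMBER
  : '1'..'4' ('0'..'9')?   // 1-4 and 10-49
  | '6'..'9'               // 6-9
  | '5' '0'                // 50
  ;
LETTER
  : 'a'..'c' | 'A'..'C'
  ;

// Parser rules: the shared value '5' is listed in both value sets.
range1_values
  : FIVE
  | OTHER_NUMBER
  ;
range2_values
  : FIVE
  | LETTER
  ;
```

With rules along these lines, rangeval1_expr would reference range1_values where it currently references RANGE1_VALUES, and rangeval2_expr would reference range2_values. Because "5" now has a single token type that both parser rules accept, the lexer no longer has to guess which value set the parser expects.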
Instead of trying to solve this at the lexer level, you might simply match any integer value and use a semantic predicate inside the parser rule to check whether the value is the correct one (or in the correct range).
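A minimal sketch of the semantic-predicate approach might look like the fragment below. The names INT, LETTER and the label n are assumptions made for illustration, and the predicate bodies are written in C# on the assumption that the grammar keeps language=CSharp3; this has not been checked against ANTLR 3.4 itself.

```antlr
// Sketch only: INT, LETTER and the label n are hypothetical names.
// The code inside { ... }? is C#, matching the CSharp3 target.

INT
  : ('0'..'9')+
  ;
LETTER
  : 'a'..'c' | 'A'..'C'
  ;

// Accept any INT token, then validate that it lies in 1..50.
// The length check keeps int.Parse from overflowing on long digit runs.
range1_values
  : n=INT { $n.text.Length <= 2 && int.Parse($n.text) >= 1 && int.Parse($n.text) <= 50 }?
  ;

// Accept a letter a-c (either case), or an INT whose text is exactly "5".
range2_values
  : LETTER
  | n=INT { $n.text == "5" }?
  ;
```

Since each predicate follows the token it checks, it acts as a validating predicate: the token is matched first and the condition is evaluated afterwards, so an out-of-range value should surface as a failed-predicate recognition error rather than a lexer error. Note that int.Parse also accepts leading zeros, so "05" would pass the range1_values check in this sketch.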