ANTLR grammar not handling my "not" operator correctly

Posted on 2024-12-03 13:11:16

I am trying to parse a small expression language (I didn't define the language, from a vendor) and everything is fine until I try to use the not operator, which is a tilde in this language.

My grammar has been heavily influenced by these two links (aka shameless cut and pasting):

http://www.codeproject.com/KB/recipes/sota_expression_evaluator.aspx http://www.alittlemadness.com/2006/06/05/antlr-by-example-part-1-the-language

The language consists of three expression types that can be combined with and, or, and not operators, plus parentheses to change precedence. The expressions are:

Skill("name") > some_number (can also be <, >=, <=,  =, !=)
SkillExists("name")
LoggedIn("name") (this one can also have name@name)

This input works fine:

Skill("somename") > 1 | (LoggedIn("somename") & SkillExists("othername"))

However, as soon as I try to use the not operator I get a NoViableAltException, and I can't figure out why. I have compared my grammar to the ECalc.g grammar at the codeproject.com link and they seem to match, so there must be some subtle difference I can't see. This input fails:

Skill("somename") < 10 ~ SkillExists("othername")

My Grammar:

grammar UserAttribute;

options {
output=AST;
ASTLabelType=CommonTree;
}

tokens {
SKILL = 'Skill' ;
SKILL_EXISTS = 'SkillExists' ;
LOGGED_IN = 'LoggedIn';
GT = '>';
LT = '<';
LTE = '<=';
GTE = '>=';
EQUALS = '=';
NOT_EQUALS = '!=';  
AND = '&';
OR = '|' ;
NOT = '~';
LPAREN   = '(';
RPAREN = ')';
QUOTE = '"';
AT = '@';       
}

/*------------------------------------------------------------------
 * PARSER RULES
 *------------------------------------------------------------------*/  
expression : orexpression EOF!; 
orexpression    : andexpression (OR^ andexpression)*;
andexpression   : notexpression (AND^ notexpression)*;  
notexpression : primaryexpression | NOT^ primaryexpression;
primaryexpression : term | LPAREN! orexpression RPAREN!;
term    : skill_exists | skill | logged_in;
skill_exists    : SKILL_EXISTS LPAREN QUOTE NAME QUOTE RPAREN;
logged_in : LOGGED_IN LPAREN QUOTE NAME (AT NAME)? QUOTE RPAREN;
skill:  SKILL LPAREN QUOTE NAME QUOTE RPAREN ((GT | LT| LTE | GTE | EQUALS | NOT_EQUALS)? NUMBER*)?;

/*------------------------------------------------------------------
 * LEXER RULES
 *------------------------------------------------------------------*/
NAME    : ('a'..'z' | 'A'..'Z' | '_')+;
NUMBER  : ('0'..'9')+ ;
WHITESPACE : ( '\t' | ' ' | '\r' | '\n'| '\u000C' )+    { $channel = HIDDEN; } ;

Comments (1)

与他有关 2024-12-10 13:11:16

I have 2 remarks:

1

Since you're parsing a single expression (expression : orexpression EOF!;), the input Skill("somename") < 10 ~ SkillExists("othername") is not only invalid in your grammar, it is invalid for any expression parser I know of. A notexpression only takes a right-hand-side operand, so ~ SkillExists("othername") is one complete expression and Skill("somename") < 10 is another. But between those two expressions there is no OR or AND operator. It is the same as evaluating true false instead of true | false or true & false.

In short, your grammar disallows:

Skill("somename") < 10 ~ SkillExists("othername")

but allows for:

Skill("somename") < 10 & SkillExists("othername")

which seems logical to me.
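
To make the first remark concrete, here is a minimal hand-written recursive-descent sketch in Python that mirrors the parser rules of the grammar. It is not ANTLR output: the names tokenize, Parser and parse are made up for this illustration, the tree is built from plain tuples, and the skill rule is simplified to an optional operator followed by exactly one number. Under those assumptions, the & input parses, while the ~ input is rejected because a token is left over where the end of input is expected.

```python
import re

TOKEN_RE = re.compile(r'<=|>=|!=|[<>=&|~()"@]|[A-Za-z_]+|[0-9]+')

def tokenize(text):
    """Split the input into tokens, skipping whitespace."""
    tokens, pos = [], 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r}")
        tokens.append(match.group())
        pos = match.end()
    return tokens

class Parser:
    COMPARISONS = ("<", ">", "<=", ">=", "=", "!=")

    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self, expected=None):
        """Consume one token, optionally checking that it equals `expected`."""
        token = self.peek()
        if token is None or (expected is not None and token != expected):
            raise SyntaxError(f"expected {expected!r}, found {token!r}")
        self.pos += 1
        return token

    def take_matching(self, pattern):
        """Consume one token that fully matches the regular expression."""
        token = self.take()
        if not re.fullmatch(pattern, token):
            raise SyntaxError(f"unexpected token {token!r}")
        return token

    def expression(self):          # expression : orexpression EOF
        tree = self.or_expression()
        if self.peek() is not None:
            raise SyntaxError(f"expected end of input, found {self.peek()!r}")
        return tree

    def or_expression(self):       # andexpression (OR andexpression)*
        tree = self.and_expression()
        while self.peek() == "|":
            self.take()
            tree = ("|", tree, self.and_expression())
        return tree

    def and_expression(self):      # notexpression (AND notexpression)*
        tree = self.not_expression()
        while self.peek() == "&":
            self.take()
            tree = ("&", tree, self.not_expression())
        return tree

    def not_expression(self):      # primaryexpression | NOT primaryexpression
        if self.peek() == "~":
            self.take()
            return ("~", self.primary_expression())
        return self.primary_expression()

    def primary_expression(self):  # term | LPAREN orexpression RPAREN
        if self.peek() == "(":
            self.take()
            tree = self.or_expression()
            self.take(")")
            return tree
        return self.term()

    def term(self):                # skill_exists | skill | logged_in
        kind = self.take_matching(r"SkillExists|Skill|LoggedIn")
        self.take("(")
        self.take('"')
        name = self.take_matching(r"[A-Za-z_]+")
        if kind == "LoggedIn" and self.peek() == "@":
            self.take()
            name += "@" + self.take_matching(r"[A-Za-z_]+")
        self.take('"')
        self.take(")")
        if kind == "Skill" and self.peek() in self.COMPARISONS:
            operator = self.take()  # simplified: operator plus exactly one number
            return (operator, ("Skill", name), int(self.take_matching(r"[0-9]+")))
        return (kind, name)

def parse(text):
    return Parser(text).expression()
```

Calling parse on the & input returns a nested tuple with "&" at the root, whereas the ~ input raises SyntaxError: after Skill("somename") < 10 has been consumed, neither loop can continue on ~, so the end-of-input check fails. Writing the input as Skill("somename") < 10 & ~ SkillExists("othername") parses in this sketch.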

2

I don't quite understand your skill rule (which is ambiguous, btw):

skill
 : SKILL LPAREN QUOTE NAME QUOTE RPAREN 
     ((GT | LT| LTE | GTE | EQUALS | NOT_EQUALS)? NUMBER*)?
 ;

This means that the operator is optional and that zero or more numbers can follow it, so the following inputs are all valid:

  • Skill("foo") = 10 20
  • Skill("foo") 10 20 30
  • Skill("foo") <

Perhaps you meant:

skill
 : SKILL LPAREN QUOTE NAME QUOTE RPAREN 
     ((GT | LT| LTE | GTE | EQUALS | NOT_EQUALS)^ NUMBER)?
 ;

instead? (the ? becomes a ^ and the * is removed)
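
The difference between the two tails of the skill rule can be checked with a small Python sketch. It is only an approximation of what the parser does: the part after Skill("...") is assumed to be whitespace-separated, each token is encoded as O (operator) or N (number), and the two rule tails are written as ordinary regular expressions. The names symbols, accepts, ORIGINAL_TAIL and SUGGESTED_TAIL are invented for this example.

```python
import re

OPERATORS = {"<", ">", "<=", ">=", "=", "!="}

def symbols(tail):
    """Encode the whitespace-separated tokens after Skill("...") as O or N."""
    encoded = ""
    for token in tail.split():
        if token in OPERATORS:
            encoded += "O"          # comparison operator
        elif token.isdigit():
            encoded += "N"          # NUMBER
        else:
            raise ValueError(f"unexpected token {token!r}")
    return encoded

# Original tail:  ((GT | LT | LTE | GTE | EQUALS | NOT_EQUALS)? NUMBER*)?
ORIGINAL_TAIL = re.compile(r"(O?N*)?")
# Suggested tail: ((GT | LT | LTE | GTE | EQUALS | NOT_EQUALS)^ NUMBER)?
SUGGESTED_TAIL = re.compile(r"(ON)?")

def accepts(rule, tail):
    """Return True if the whole tail is matched by the given rule tail."""
    return rule.fullmatch(symbols(tail)) is not None

for tail in ["= 10 20", "10 20 30", "<", "< 10", ""]:
    print(repr(tail), accepts(ORIGINAL_TAIL, tail), accepts(SUGGESTED_TAIL, tail))
```

In this encoding the original tail accepts all five samples, including the three odd ones listed above, while the suggested tail accepts only the "< 10" form and the empty tail.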

If I only change that rule and parse the input:

Skill("somename") < 10 & SkillExists("othername")

the following AST is created:

[image: AST]

(As you can see, the AST could be better formed, i.e. you need some rewrite rules in your skill_exists, logged_in and skill rules.)


EDIT

If you want successive expressions to have implied AND tokens in between, do something like this:

grammar UserAttribute;

...
tokens {
...
I_AND;     // <- added a token without any text (imaginary token)
AND = '&';
...
}

andexpression
  :  (notexpression -> notexpression) (AND? notexpression -> ^(I_AND $andexpression notexpression))*
  ;  

...

As you can see, since the AND is now optional, it cannot be used inside the rewrite rule, so you have to use the imaginary token I_AND instead.

If you now parse the input:

Skill("somename") < 10 ~ SkillExists("othername")

you will get the following AST:

[image: AST]
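
The implied-AND idea can also be sketched by hand in Python to show the shape of the resulting tree. This is an illustration under simplifying assumptions, not what ANTLR generates: a whole term such as Skill("x") < 10 is lexed as a single token by a deliberately lenient regular expression, trees are plain tuples, and tokenize, Parser and parse are hypothetical names. The part that corresponds to the edited andexpression rule is the and_expression method, which treats & as optional and always emits an I_AND node.

```python
import re

# A whole term is lexed as one token to keep the sketch short (lenient on purpose).
TERM = r'(?:SkillExists|LoggedIn|Skill)\("[A-Za-z_@]+"\)(?:\s*(?:<=|>=|!=|[<>=])\s*[0-9]+)?'
TOKEN_RE = re.compile(rf'\s*({TERM}|[&|~()])')

def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected input at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

class Parser:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        token = self.peek()
        if token is None:
            raise SyntaxError("unexpected end of input")
        self.pos += 1
        return token

    def expression(self):          # orexpression EOF
        tree = self.or_expression()
        if self.peek() is not None:
            raise SyntaxError(f"expected end of input, found {self.peek()!r}")
        return tree

    def or_expression(self):       # andexpression (OR andexpression)*
        tree = self.and_expression()
        while self.peek() == "|":
            self.take()
            tree = ("|", tree, self.and_expression())
        return tree

    def and_expression(self):      # notexpression (AND? notexpression)*
        tree = self.not_expression()
        # Keep going while the next token can continue an and-chain.
        while self.peek() not in (None, "|", ")"):
            if self.peek() == "&":
                self.take()        # an explicit AND is consumed but not kept
            tree = ("I_AND", tree, self.not_expression())
        return tree

    def not_expression(self):      # primaryexpression | NOT primaryexpression
        if self.peek() == "~":
            self.take()
            return ("~", self.primary_expression())
        return self.primary_expression()

    def primary_expression(self):  # term | LPAREN orexpression RPAREN
        token = self.take()
        if token == "(":
            tree = self.or_expression()
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return tree
        if token in ("&", "|", "~", ")"):
            raise SyntaxError(f"no viable alternative at {token!r}")
        return token               # a whole term, kept as text

def parse(text):
    return Parser(text).expression()
```

With this sketch, the previously failing input yields a tree with I_AND at the root, the Skill comparison as its first child and the ~ node wrapping SkillExists("othername") as its second child. An input that uses an explicit & produces the same I_AND node.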
