Is there a way to improve an ANTLR 3 grammar for positive and negative integer and decimal numbers?

Posted 2024-11-07 17:51:44

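Is there a way to express this in a less repetitive fashion, given the optional positive and negative signs?

What I am trying to accomplish is to allow an optional positive + (the default) or negative - sign on number literals that optionally have an exponent and/or a decimal part.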

NUMBER : ('+'|'-')? DIGIT+ '.' DIGIT* EXPONENT?
       | ('+'|'-')? '.'? DIGIT+ EXPONENT?
       ;

fragment 
EXPONENT : ('e' | 'E') ('+' | '-') ? DIGIT+ 
         ;

fragment
DIGIT  : '0'..'9' 
       ;

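I want to be able to recognize NUMBER patterns. I am not concerned with arithmetic on those numbers at this point (that will come later); for now I am trying to understand how to recognize any NUMBER literal, where numbers look like: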

123
+123
-123
0.123
+.123
-.123
123.456
+123.456
-123.456
123.456e789
+123.456e789
-123.456e789

and any other standard formats that I haven't thought to include here.



原来是傀儡 2024-11-14 17:51:44

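To answer your question: no, as far as I know there is no way to improve this. You could place ('+' | '-') inside a fragment rule and use that fragment, just like the EXPONENT fragment, but I wouldn't call it a real improvement.

Note that unary + and - signs are generally not part of a number token. Consider the input source "1-2". You don't want that to be tokenized as two numbers, NUMBER[1] and NUMBER[-2], but as NUMBER[1], MINUS[-] and NUMBER[2], so your grammar should contain something like the following: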
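As a rough illustration of the fragment idea, the question's NUMBER rule can be mirrored as a regular expression in which the sign pattern is named once and reused. This is only a sketch: Python's re module stands in for the ANTLR lexer, and the names SIGN, DIGIT, EXPONENT and is_number are assumptions chosen to echo the grammar, not anything ANTLR generates.

```python
import re

# Assumed Python stand-ins for the ANTLR fragments; names mirror the grammar.
SIGN = r"[+-]"
DIGIT = r"[0-9]"
EXPONENT = rf"[eE]{SIGN}?{DIGIT}+"

# NUMBER : SIGN? DIGIT+ '.' DIGIT* EXPONENT?
#        | SIGN? '.'? DIGIT+ EXPONENT?
NUMBER = re.compile(
    rf"{SIGN}?{DIGIT}+\.{DIGIT}*(?:{EXPONENT})?"
    rf"|{SIGN}?\.?{DIGIT}+(?:{EXPONENT})?"
)


def is_number(text: str) -> bool:
    """Return True if the whole string is one signed NUMBER literal."""
    return NUMBER.fullmatch(text) is not None


# The literals listed in the question.
samples = [
    "123", "+123", "-123",
    "0.123", "+.123", "-.123",
    "123.456", "+123.456", "-123.456",
    "123.456e789", "+123.456e789", "-123.456e789",
]

if __name__ == "__main__":
    for s in samples:
        print(s, is_number(s))
```

Naming the sign removes the repeated ('+' | '-') text but not the repeated structure, which is why the gain is modest.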

parse
  :  statement+ EOF
  ;

statement
  :  assignment
  ;

assignment
  :  IDENTIFIER '=' expression
  ;

expression
  :  addition
  ;

addition
  :  multiplication (('+' | '-') multiplication)*
  ;

multiplication
  :  unary (('*' | '/') unary)*
  ;

unary
  :  '-' atom
  |  '+' atom
  |  atom
  ;

atom
  :  NUMBER
  |  IDENTIFIER
  |  '(' expression ')'
  ;

IDENTIFIER
  :  ('a'..'z' | 'A'..'Z' | '_') ('a'..'z' | 'A'..'Z' | '_' | DIGIT)*
  ;

NUMBER 
  :  DIGIT+ '.' DIGIT* EXPONENT?
  |  '.'? DIGIT+ EXPONENT?
  ;

fragment 
EXPONENT 
  :  ('e' | 'E') ('+' | '-') ? DIGIT+ 
  ;

fragment
DIGIT  
  :  '0'..'9' 
  ;

and addition will therefore match the input "1-2".

EDIT

An expression like 111.222 + -456 will be parsed as:

[image: parse tree]

and +123 + -456 as:

[image: parse tree]
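To see the tokenization point concretely, here is a minimal sketch of a lexer with an unsigned NUMBER, written with Python's re module rather than ANTLR. The token names PLUS, MINUS and WS and the tokenize helper are assumptions for illustration; the grammar above uses the literals '+' and '-' directly and defines no whitespace rule.

```python
import re

# Hypothetical token table; NUMBER deliberately carries no sign.
TOKEN_RE = re.compile(
    r"""
    (?P<NUMBER>[0-9]+\.[0-9]*(?:[eE][+-]?[0-9]+)?|\.?[0-9]+(?:[eE][+-]?[0-9]+)?)
  | (?P<IDENTIFIER>[A-Za-z_][A-Za-z_0-9]*)
  | (?P<PLUS>\+)
  | (?P<MINUS>-)
  | (?P<WS>\s+)
    """,
    re.VERBOSE,
)


def tokenize(source):
    """Split source into (token_name, text) pairs, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if m is None:
            raise ValueError(f"unexpected character {source[pos]!r} at {pos}")
        if m.lastgroup != "WS":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


if __name__ == "__main__":
    print(tokenize("1-2"))
```

With the sign left out of NUMBER, "1-2" yields three tokens, so a rule shaped like addition can consume it as a subtraction.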
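Since the parse tree images are not available here, the following sketch shows how the two expressions would be grouped by rules shaped like addition, multiplication, unary and atom. It is a hand-written recursive descent parser in Python, not ANTLR output; the nested-tuple tree format and the names Parser, tokenize and parse_expression are assumptions made for illustration.

```python
import re

# Assumed token pattern: unsigned NUMBER, IDENTIFIER, or one operator/paren.
TOKEN = re.compile(
    r"\s*([0-9]+\.[0-9]*(?:[eE][+-]?[0-9]+)?|\.?[0-9]+(?:[eE][+-]?[0-9]+)?"
    r"|[A-Za-z_][A-Za-z_0-9]*|[-+*/()])"
)


def tokenize(source):
    """Return the token texts of source, ignoring whitespace."""
    source = source.rstrip()
    tokens, pos = [], 0
    while pos < len(source):
        m = TOKEN.match(source, pos)
        if m is None:
            raise ValueError(f"unexpected input at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        tok = self.peek()
        if tok is None:
            raise ValueError("unexpected end of input")
        self.pos += 1
        return tok

    def addition(self):
        # addition : multiplication (('+' | '-') multiplication)*
        node = self.multiplication()
        while self.peek() in ("+", "-"):
            op = self.take()
            node = (op, node, self.multiplication())
        return node

    def multiplication(self):
        # multiplication : unary (('*' | '/') unary)*
        node = self.unary()
        while self.peek() in ("*", "/"):
            op = self.take()
            node = (op, node, self.unary())
        return node

    def unary(self):
        # unary : '-' atom | '+' atom | atom
        if self.peek() in ("+", "-"):
            return ("unary" + self.take(), self.atom())
        return self.atom()

    def atom(self):
        # atom : NUMBER | IDENTIFIER | '(' expression ')'
        tok = self.take()
        if tok == "(":
            node = self.addition()
            if self.take() != ")":
                raise ValueError("expected ')'")
            return node
        if tok in ("+", "-", "*", "/", ")"):
            raise ValueError(f"unexpected token {tok!r}")
        return tok


def parse_expression(source):
    """Parse one expression and return it as nested tuples."""
    parser = Parser(tokenize(source))
    tree = parser.addition()
    if parser.peek() is not None:
        raise ValueError("trailing input")
    return tree


if __name__ == "__main__":
    print(parse_expression("111.222 + -456"))
    print(parse_expression("+123 + -456"))
```

In both cases the sign next to 456 ends up under a unary node rather than inside the NUMBER token, which is the separation the answer argues for.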
