Resolving left recursion with ANTLRWorks

Posted on 2024-10-05 21:08:25

Hi, I want to write a grammar (using ANTLRWorks) that will later accept, in debugging mode, this code:

repeat_until
    :'repeat' seq_statement 'until' exp
    ;

read
    :   'read' ID ';'
    ;

fragment
Operation_stat
    :   (NUMBER|ID) OP (NUMBER|ID)
    ;

OP  :   ('+'|'-'|'*'|'/')
    ;

NUMBER
    :   '0'..'9'+
    ;

LOG_OP
    :   ('<' | '>' | '=' | '<=' | '>=' )
    ;


ID  :   ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*
    ;



FLOAT
    :   ('0'..'9')+ '.' ('0'..'9')* EXPONENT?
    |   '.' ('0'..'9')+ EXPONENT?
    |   ('0'..'9')+ EXPONENT
    ;

COMMENT
    :   '//' ~('\n'|'\r')* '\r'? '\n' {$channel=HIDDEN;}
    |   '/*' ( options {greedy=false;} : . )* '*/' {$channel=HIDDEN;}
    ;

WS  :   ( ' '
        | '\t'
        | '\r'
        | '\n'
        ) {$channel=HIDDEN;}
    ;

STRING
    :  '\'' ( ESC_SEQ | ~('\\'|'\'') )* '\''
    ;




fragment
EXPONENT : ('e'|'E') ('+'|'-')? ('0'..'9')+ ;

fragment
HEX_DIGIT : ('0'..'9'|'a'..'f'|'A'..'F') ;

fragment
ESC_SEQ
    :   '\\' ('b'|'t'|'n'|'f'|'r'|'\"'|'\''|'\\')
    |   UNICODE_ESC
    |   OCTAL_ESC
    ;

fragment
OCTAL_ESC
    :   '\\' ('0'..'3') ('0'..'7') ('0'..'7')
    |   '\\' ('0'..'7') ('0'..'7')
    |   '\\' ('0'..'7')
    ;

fragment
UNICODE_ESC
    :   '\\' 'u' HEX_DIGIT HEX_DIGIT HEX_DIGIT HEX_DIGIT
    ;

Thanks for your help.



甜警司 2024-10-12 21:08:25

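I believe ANTLRWorks has a feature that helps remove left recursion from a grammar, although, as far as I remember, it only works with very basic grammars. It has been a while since I last worked with it, so you will have to investigate that yourself.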

To remove left recursion manually, see: http://www.antlr.org/wiki/display/ANTLR3/Left-Recursion+Removal (make sure to go through all 3 sections).
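For direct left recursion, the manual rewrite usually amounts to turning `A : A alpha | beta` into `A : beta (alpha)*`. The Python sketch below illustrates that transformation on rules given as lists of symbols. The function name `remove_direct_left_recursion` and the list-of-lists representation are assumptions made for illustration; they are not part of ANTLR or ANTLRWorks, and indirect left recursion is not handled.

```python
def remove_direct_left_recursion(name, alternatives):
    """Rewrite  A : A alpha | beta  as  A : beta (alpha)*  (EBNF text).

    `alternatives` is a list of alternatives, each a list of symbol names.
    This is a hypothetical helper for illustration, not an ANTLR API.
    """
    def render(alts):
        return " | ".join(" ".join(alt) for alt in alts)

    def group(alts):
        # Parenthesise only when there is more than one alternative.
        return f"({render(alts)})" if len(alts) > 1 else render(alts)

    # Tails (alpha) of alternatives that begin with the rule itself.
    tails = [alt[1:] for alt in alternatives if alt and alt[0] == name]
    # Alternatives (beta) that do not begin with the rule itself.
    bases = [alt for alt in alternatives if not alt or alt[0] != name]

    if not tails:
        return render(alternatives)  # nothing to rewrite
    if not bases:
        # Without a non-recursive alternative the rule can never terminate.
        raise ValueError(f"rule {name!r} has no non-recursive alternative")
    if any(not tail for tail in tails):
        raise ValueError(f"rule {name!r} has an alternative that is only itself")

    return f"{group(bases)} ({render(tails)})*"


print(remove_direct_left_recursion(
    "simple_exp", [["simple_exp", "OP", "term"], ["term"]]))
```

With the `simple_exp` rule as input, the sketch yields `term (OP term)*`. A rule whose alternatives all start with the rule itself is rejected, because no rewrite can give it a way to stop.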

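EDIT

I'm not sure I can help you: you seem to be completely missing the point that ANTLR 3 can't cope with left-recursive grammars. These parser rules of yours: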

seq_statement 
  :  seq_statement ';' statement 
  |  seq_statement
  ;

simple_exp
  :  simple_exp OP term 
  |  term    
  ;

term    
  :  term OP factor factor 
  |  factor  
  ;

are all so obviously left recursive that I'm not sure how to explain this any more clearly. I mean, can't you see what's wrong with a rule like:

a
  : a b
  ;

? This is basically the same as your seq_statement rule.
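To make the problem concrete, here is a minimal Python sketch of what a hand-written top-down parser for `a : a b ;` would do. It only approximates how an LL parser behaves; the function names and the token list are hypothetical.

```python
def parse_a_left_recursive(tokens, pos=0):
    # a : a b ;
    # The first step of the rule is to match `a` again, so the function
    # calls itself before consuming any token and `pos` never advances.
    pos = parse_a_left_recursive(tokens, pos)
    if pos < len(tokens) and tokens[pos] == "b":
        return pos + 1
    raise SyntaxError("expected 'b'")


def recurses_without_progress(tokens):
    """Return True if parsing blows the call stack instead of finishing."""
    try:
        parse_a_left_recursive(tokens)
    except RecursionError:
        return True
    return False


print(recurses_without_progress(["b", "b"]))
```

No input can help here: the recursion happens before the first token is inspected, which is why the rule has to be rewritten rather than the input changed.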

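I get the impression you're trying to convert some existing grammar into an ANTLR grammar. Is that the case? And do you know what left recursion actually means?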

EDIT II

Something like:

parse
  :  block EOF 
  ;

block
  :  statement (';' statement)* ';'? 
  ;

statement
  :  'read' expression  
  |  'write' expression 
  |  ifStatement
  |  repeatStatement
  |  assignment
  ;

ifStatement
  :  'if' expression 'then' block? ('else' block?)? 'end' 
  ;

repeatStatement
  :  'repeat' block? 'until' expression 
  ;

assignment
  :  Identifier ':=' expression 
  ;

expression
  :  equalityExp
  ;

equalityExp
  :  relationalExp (('=' | '!=') relationalExp)*
  ;

relationalExp
  :  additiveExp (('>=' | '<=' | '>' | '<') additiveExp)*
  ;

additiveExp
  :  multiplicativeExp (('+' | '-') multiplicativeExp)*
  ;

multiplicativeExp
  :  atom (('*' | '/' | '%') atom)*
  ;

atom
  :  Identifier
  |  Int
  |  '(' expression ')' 
  ;

Int
  :  '0'..'9'+
  ;

Identifier
  :  'a'..'z'+
  ;

Space
  :  (' ' | '\t' | '\r' | '\n') {skip();}
  ;

ought to do the trick.
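The `( ... )*` loops in the expression rules above replace left recursion with iteration while still producing left-associative results. The following Python sketch mirrors only the `additiveExp`, `multiplicativeExp` and `atom` rules as a hand-written recursive-descent parser. The class, the function names and the tuple-based tree are assumptions for illustration; this is not code generated by ANTLR.

```python
import re

# Int: digits, Identifier: lowercase letters, plus operators and parentheses.
TOKEN = re.compile(r"\s*(\d+|[a-z]+|[-+*/%()])")


def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def next(self):
        token = self.peek()
        if token is None:
            raise SyntaxError("unexpected end of input")
        self.pos += 1
        return token

    def additive(self):
        # additiveExp : multiplicativeExp (('+' | '-') multiplicativeExp)*
        node = self.multiplicative()
        while self.peek() in ("+", "-"):
            op = self.next()
            node = (op, node, self.multiplicative())  # fold to the left
        return node

    def multiplicative(self):
        # multiplicativeExp : atom (('*' | '/' | '%') atom)*
        node = self.atom()
        while self.peek() in ("*", "/", "%"):
            op = self.next()
            node = (op, node, self.atom())
        return node

    def atom(self):
        # atom : Identifier | Int | '(' expression ')'
        token = self.next()
        if token == "(":
            node = self.additive()
            if self.next() != ")":
                raise SyntaxError("expected ')'")
            return node
        if token.isdigit() or token.isalpha():
            return token
        raise SyntaxError(f"unexpected token {token!r}")


def parse(text):
    parser = Parser(tokenize(text))
    tree = parser.additive()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree


print(parse("1 - 2 - 3"))
```

Each loop consumes an operator before parsing the next operand, so every iteration makes progress, and folding the result into `node` keeps `1 - 2 - 3` grouped as `(1 - 2) - 3`.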

