ANTLR grammar does not backtrack when parsing similar rules

Posted 2025-01-06 05:08:02

Suppose I have a grammar that handles global variable declarations and some method declarations for a C-like language:

program: (declaration)* (procedure)*;
declaration: typespec identifier ';';
procedure: typespec identifier '(' ')' ';';
typespec: 'char' | 'int';
identifier: ('a' .. 'z' | 'A' .. 'Z') ('A' - 'Z' | 'a' .. 'z' | '0' .. '9' | '_')*;

If I feed it something like:

int MAX;
char proc();

The parser reads int MAX; correctly, but then it also tries to apply the declaration rule to the second line and fails when it reaches the (. At that point I expected it to backtrack and try the next rule, which is procedure. Could somebody please tell me why this isn't happening?

深居我梦 2025-01-13 05:08:02

Did you post your whole grammar? I couldn't get it to compile as posted, but I reworked what you posted so that it matches your example:

program: (declaration)* (procedure)*;
statement: TYPE_SPEC IDENT ;
declaration: statement ';';
procedure: statement '(' ')' ';';

TYPE_SPEC 
    :   'char' | 'int';

IDENT 
    :   ('a' .. 'z' | 'A' .. 'Z') ('A' .. 'Z' | 'a' .. 'z' | '0' .. '9' | '_')*;

WHITESPACE
    :   ('\r' | '\n' | '\r\n' | ' ' | '\t' ) {$channel=HIDDEN;} 
    ;

I'd recommend that you write lexer rules (the ones in capitals) for token matching rather than making that matching part of your parser rules. As you can see, I've already done this for some of them.
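
To illustrate the split between lexer rules and parser rules, here is a minimal hand-written sketch in Python. It is not ANTLR output, and it does not show how ANTLR makes its own prediction; the names tokenize and parse_program are hypothetical. It assumes the same tokens as the grammar above (TYPE_SPEC, IDENT, the punctuation, hidden whitespace). Once the input is a token stream, the shared prefix TYPE_SPEC IDENT is read once and the token after it decides between declaration and procedure, so no backtracking is needed.

```python
import re

# Token patterns mirroring the lexer rules TYPE_SPEC, IDENT and WHITESPACE.
TOKEN_RE = re.compile(r"""
    (?P<TYPE_SPEC>\b(?:char|int)\b)
  | (?P<IDENT>[A-Za-z][A-Za-z0-9_]*)
  | (?P<PUNCT>[;()])
  | (?P<WHITESPACE>\s+)
""", re.VERBOSE)


def tokenize(text):
    """Turn source text into (kind, value) pairs, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        if m.lastgroup != "WHITESPACE":
            # Punctuation tokens use their own text as the token kind.
            kind = m.group() if m.lastgroup == "PUNCT" else m.lastgroup
            tokens.append((kind, m.group()))
        pos = m.end()
    return tokens


def parse_program(text):
    """Parse (declaration)* (procedure)* and return the recognized items."""
    tokens = tokenize(text)
    result = []
    i = 0
    seen_procedure = False

    def expect(kind):
        nonlocal i
        if i >= len(tokens) or tokens[i][0] != kind:
            found = tokens[i][1] if i < len(tokens) else "end of input"
            raise SyntaxError(f"expected {kind}, found {found!r}")
        value = tokens[i][1]
        i += 1
        return value

    while i < len(tokens):
        # Shared prefix of both rules: TYPE_SPEC IDENT.
        type_spec = expect("TYPE_SPEC")
        name = expect("IDENT")
        # One token of lookahead picks the rule.
        if i < len(tokens) and tokens[i][0] == "(":
            expect("(")
            expect(")")
            expect(";")
            seen_procedure = True
            result.append(("procedure", type_spec, name))
        else:
            expect(";")
            if seen_procedure:
                # The grammar puts all declarations before all procedures.
                raise SyntaxError("declaration after procedure")
            result.append(("declaration", type_spec, name))
    return result


print(parse_program("int MAX;\nchar proc();"))
```

For the example input from the question, the sketch reports one declaration followed by one procedure.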
