ANTLR does not match the correct parser rule
I am trying to create a parser for the DOCTOR script of the old ELIZA chatbot program.
The DOCTOR script is simplified here to a Welcome line followed by a line defining how "The Doctor" responds to a user input such as "IF ONLY I WAS THINNER":
(I AM THE DOCTOR.)
(IF 3 ((0 IF 0)(DO YOU 3)(YOU WISH THAT 3)))
Here is the Lexer:
ALL_CHARS: [0-9A-Z, .];
KEY_CHARS: [A-Z];
LPAREN: '(';
RPAREN: ')';
NUM: [0-9];
SPACE: ' ';
WS: ('\n')+ -> skip;
and the Parser:
main: item* EOF;
item: (rWelcome | rKeyDecompReAssy );
rKeyDecompReAssy: LPAREN rKeyPri rDecompReAssy RPAREN;
rKeyPri: rKey SPACE rPri;
rKey: KEY_CHARS+;
rPri: NUM+;
rDecompReAssy: LPAREN rDecomp rReAssyList RPAREN;
rDecomp: LPAREN ALL_CHARS+ RPAREN;
rReAssyList: (rReAssy)+;
rReAssy: LPAREN reAssy RPAREN;
reAssy: ALL_CHARS+;
rWelcome: LPAREN reAssy RPAREN;
This defines a rule for the Welcome line (rWelcome) and one for the IF line (rKeyDecompReAssy), which attempts to match four components: Key, Pri, Decomp and ReAssyList.
I use the ANTLR Preview of Android Studio.
The problem is that both lines are matched to rWelcome.
The Welcome line is OK of course, but the error messages for the second line are:
line 2:6 missing ')' at '('
line 2:45 mismatched input ')' expecting {<EOF>, '('}
How do I make the two rules unambiguous?
As mentioned in the comment, your lexer never creates KEY_CHARS, SPACE or NUM tokens. This is because the ALL_CHARS token also matches the chars defined in those tokens. And when 2 or more lexer rules match the same characters, the one defined first "wins". No matter if a parser rule is trying to match a KEY_CHARS token, the lexer simply creates an ALL_CHARS token: the lexer works independently from the parser.
What you could do is something like this:
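A minimal sketch of one way to apply this, assuming that spacing inside the script is not significant and can be skipped. The grammar name Eliza, the parser rule phrase, and the tokens WORD and PUNCT are hypothetical names introduced for illustration; the remaining rule names are taken from the question. The idea is to make the lexer rules non-overlapping and to build free text in the parser instead of in a catch-all token.

```antlr
// Hypothetical combined grammar; the name Eliza and the names
// phrase, WORD and PUNCT are illustrative, not from the original post.
grammar Eliza;

main             : item* EOF;
item             : rWelcome | rKeyDecompReAssy;

rWelcome         : LPAREN reAssy RPAREN;
rKeyDecompReAssy : LPAREN rKey rPri rDecompReAssy RPAREN;
rKey             : WORD;
rPri             : NUM;
rDecompReAssy    : LPAREN rDecomp rReAssyList RPAREN;
rDecomp          : LPAREN phrase RPAREN;
rReAssyList      : rReAssy+;
rReAssy          : LPAREN reAssy RPAREN;
reAssy           : phrase;

// Free text is assembled in the parser from non-overlapping tokens.
phrase           : (WORD | NUM | PUNCT)+;

// Lexer rules: no two rules can match the same character, so rule
// order no longer decides which token is produced.
LPAREN : '(';
RPAREN : ')';
NUM    : [0-9]+;
WORD   : [A-Z]+;
PUNCT  : [,.];
WS     : [ \t\r\n]+ -> skip;  // assumption: spacing is not significant
```

With these tokens, the second input line begins LPAREN WORD NUM LPAREN, which rWelcome cannot continue (after a phrase it expects RPAREN), so the parser should be able to choose rKeyDecompReAssy for it. Because spaces are skipped, the explicit SPACE between key and priority from the original rKeyPri rule is dropped; if the exact spacing matters, it would have to be kept as a token and handled in the parser rules instead.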