ANTLR matching characters when other characters are needed first?
See the grammar below. When I try to parse:
String s = "UNH+message refere+APERAK:D:97A:UN\n";
I get the following error:
line 1:34 mismatched character '\n' expecting 'H'
line 2:0 missing RDEL at '<EOF>'
This doesn't make sense to me: the lexer seems to be looking for UNH before it has reached the \n, which would not follow the 'file' rule.
grammar Aperak;
options {
language = Java;
}
@header { package test.fixed.aperak; }
@lexer::header { package test.fixed.aperak; }
file returns [String result]: 'UNH' unh01 unh02 RDEL { $result = $unh01.text + " -- " + $unh02.text; };
unh01 : FDEL optField;
unh02 : FDEL unh02x1 unh02x2 unh02x3 unh02x4 (unh02x5)?;
unh02x1 : optField;
unh02x2 : SDEL optField;
unh02x3 : SDEL optField;
unh02x4 : SDEL optField;
unh02x5 : SDEL optField;
optField : AN*;
RDEL : '\n';
SDEL : ':';
FDEL : '+';
AN : 'a'..'z' | 'A'..'Z' | '0'..'9' | ' ';
Comments (2)
Your lexer really looks like this:
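A sketch of the lexer rules ANTLR effectively works with for this grammar. The rule name UNH_LITERAL and the grammar name AperakLexerSketch are hypothetical: ANTLR auto-generates its own name for the token created from the literal, and the `lexer grammar` wrapper is only there so the fragment stands alone.

```antlr
lexer grammar AperakLexerSketch; // hypothetical name, for illustration only

// Implicit rule created from the literal 'UNH' in the parser rule `file`.
// ANTLR generates its own name for it; UNH_LITERAL is a placeholder.
UNH_LITERAL : 'UNH';

// The explicit lexer rules, unchanged from the grammar in the question.
RDEL : '\n';
SDEL : ':';
FDEL : '+';
AN   : 'a'..'z' | 'A'..'Z' | '0'..'9' | ' ';
```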
The literal 'UNH' in your file rule becomes a lexer rule placed above all other lexer rules.

When the lexer now stumbles upon "UN" followed by something other than "H", it produces an error because the lexer has nowhere to backtrack to. If your AN rule had matched more than a single character, the lexer could follow that rule, but since it only matches a single character, the lexer will not backtrack from "UN".

What dasblinkenlight already suggested is correct: AN should match 1 or more characters, and optField can then match an optional AN. The other part of his (or her) answer is not quite correct though, hence my answer.
I think ANTLR gets genuinely confused by two overlapping ways to cover the UNH input: a single UNH token, or a sequence of three tokens of type AN, with texts of "U", "N", and "H".

I think you should modify your optField and AN rules to move the * into the lexer, like this:
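A minimal sketch of that change, assuming only optField and AN are modified and the rest of the grammar stays as posted in the question. It follows the "AN matches 1 or more characters, optField matches an optional AN" wording from the other answer, so the repetition uses + in the lexer and ? in the parser:

```antlr
// The parser now asks for at most one AN token per field.
optField : AN?;

// AN consumes a whole run of letters, digits, and spaces as a single token,
// so the repetition lives in the lexer rather than the parser.
AN : ('a'..'z' | 'A'..'Z' | '0'..'9' | ' ')+;
```

With this change, the trailing "UN" in the sample input can be matched as one AN token instead of failing inside the 'UNH' literal. One caveat, as far as I can tell: a field whose entire text is exactly "UNH" would still be tokenized as the 'UNH' literal and not as AN.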