ANTLR parser rules with string literals
Say my parser rules look like this:
rule1 : 'functionA' '(' expression-inside-parenthesis ')';
expression-inside-parenthesis: ....;
But I never defined any lexer rules for 'functionA', '(' and ')'. Would the parser still treat these as tokens? '(' and ')' are only one character each, so I suppose it makes no difference. But if I never defined 'functionA' as a token in my lexer rules, how can the parser see it as a token?
Comments (1)
ANTLR creates a token for you behind the scenes.

A parser rule that contains literals, like rule1 in the question, is equivalent to the same rule with each literal replaced by a reference to a lexer rule that matches exactly that literal.

For tokens that consist of only one character and do not occur within other tokens, like '(' and ')', it is okay to define them "on the fly" inside your parser rule. But as soon as your lexer grammar also contains identifier-like tokens, it is best to define a token like 'functionA' explicitly inside the lexer grammar. When you define the tokens yourself, it is clearer in what order the lexer tries to tokenize your input.

EDIT

If you have used a literal token in a parser rule and also defined a lexer rule that matches the same text, ANTLR interprets the parse rule as if the literal were a reference to that lexer rule.
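The equivalence described in the answer can be sketched as a small combined grammar. This is only an illustration, written in ANTLR 4 syntax as an assumption (the thread does not state a version). The token names FUNCTION_A, LPAREN, RPAREN, ID and WS, the grammar name Sketch, and the body of expr are hypothetical; the question leaves the expression rule elided, so a placeholder stands in for it.

```antlr
grammar Sketch;

// The question writes this rule with literals:
//   rule1 : 'functionA' '(' expr ')' ;
// The explicit form below replaces each literal with a named token.
rule1 : FUNCTION_A LPAREN expr RPAREN ;

// The EDIT case: FUNCTION_A matches exactly 'functionA', so the
// literal here is treated as a reference to FUNCTION_A.
parse : 'functionA' ID ;

// Hypothetical placeholder; the real rule body is elided in the question.
expr : ID ;

// Explicit lexer rules. FUNCTION_A is listed before ID, so the input
// "functionA" becomes a FUNCTION_A token even though ID also matches it.
FUNCTION_A : 'functionA' ;
LPAREN     : '(' ;
RPAREN     : ')' ;
ID         : [a-zA-Z]+ ;
WS         : [ \t\r\n]+ -> skip ;
```

Listing the keyword-like token above ID makes the precedence visible in the grammar itself, which is the clarity the answer recommends once identifier-like tokens are involved.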