pushMode and popMode example

Posted on 2025-01-09 04:20:11

I am trying to learn ANTLR and came across lexical modes with pushMode and popMode.

I went through a lot of material on Google about this (including the mega tutorial), but failed to get pushMode and popMode working.

My Lexer grammar is

lexer grammar MyLexer;
OPEN_QUOTE: '"' -> pushMode(STRING);
TEXT: [a-zA-Z]+ ;
NUMBER: [0-9]+;
mode STRING;
CLOSE_QUOTE: '"' -> popMode;
WORD: [a-zA-Z]+ ;
NUM: [0-9]+;

My Parser is

parser grammar MyParser;
options {tokenVocab=MyLexer;}
test: sentence string ;
sentence: (TEXT|NUMBER)+;
string: OPEN_QUOTE (WORD NUM) CLOSE_QUOTE;

And my input is

this is sentence 
"this is string"

I am not convinced by the fact that for the first input I am using the lexer rules that are defined in pushMode and mode, while for the second statement I am using the lexer rules that are outside pushMode and popMode.
I was under the impression that this should be the other way around, as below

parser grammar MyParser;
options {tokenVocab=MyLexer;}
test: sentence string ;
sentence: (WORD|NUM)+;
string: OPEN_QUOTE (TEXT|NUMBER) CLOSE_QUOTE;

Can someone please help me understand this?

夜夜流光相皎洁 2025-01-16 04:20:11

You're not handling whitespace in your lexer rules.

A common rule is:

WS: [ \t\r\n]+ -> skip;

You don't want to skip whitespace within a string however. In fact you really don't want to skip anything.

Try:

lexer grammar MyLexer;
OPEN_QUOTE: '"' -> pushMode(STRING);
TEXT: [a-zA-Z]+ ;
NUMBER: [0-9]+;
mode STRING;
CLOSE_QUOTE: '"' -> popMode;
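STRING_CONTENT: ~["]+ ;

The ~["]+ says to match one or more characters that are not a ". (Using + rather than * keeps the rule from matching the empty string, which ANTLR warns about.)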

In your lexer rules, OPEN_QUOTE, TEXT, and NUMBER are all rules in the default (top-level) lexer mode. When the lexer encounters OPEN_QUOTE, it is "pushed" into the STRING mode, where it only considers the CLOSE_QUOTE, WORD, and NUM rules. (Of course, the popMode on CLOSE_QUOTE pops the lexer back into the default mode.) You should consider using the grun tool to dump the token stream (-tokens option), as it might make this a bit clearer.
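
If it helps to see the mode stack in action without running ANTLR, here is a minimal Python sketch that imitates the behaviour described above. It is only an approximation, not ANTLR's actual implementation: the RULES table and the tokenize function are hypothetical names, the WS skip rule in the default mode is an assumption added so the sample input lexes cleanly, and STRING_CONTENT is taken here to mean one or more non-quote characters.

```python
import re

# Rules per mode: (token name, regex, action).
# action is None, "skip", "pop", or ("push", mode_name).
# Token names mirror the grammar in this answer; the WS rule is an assumption.
RULES = {
    "DEFAULT": [
        ("OPEN_QUOTE", r'"', ("push", "STRING")),
        ("TEXT", r"[a-zA-Z]+", None),
        ("NUMBER", r"[0-9]+", None),
        ("WS", r"[ \t\r\n]+", "skip"),
    ],
    "STRING": [
        ("CLOSE_QUOTE", r'"', "pop"),
        ("STRING_CONTENT", r'[^"]+', None),
    ],
}


def tokenize(text):
    """Return (token name, matched text) pairs, tracking modes on a stack."""
    modes = ["DEFAULT"]
    pos = 0
    tokens = []
    while pos < len(text):
        # Only the rules of the mode on top of the stack are candidates.
        best = None
        for name, pattern, action in RULES[modes[-1]]:
            m = re.compile(pattern).match(text, pos)
            # Longest match wins; on a tie the earlier rule is kept.
            if m and (best is None or m.end() > best[1].end()):
                best = (name, m, action)
        if best is None:
            raise ValueError(f"no rule matches at {pos} in mode {modes[-1]}")
        name, m, action = best
        pos = m.end()
        if action == "skip":
            continue  # matched, but no token is emitted
        tokens.append((name, m.group()))
        if action == "pop":
            modes.pop()  # back to the previous mode
        elif action is not None:
            modes.append(action[1])  # enter the pushed mode
    return tokens


if __name__ == "__main__":
    for tok in tokenize('this is sentence \n"this is string"'):
        print(tok)
```

Running it on the sample input gives three TEXT tokens, then OPEN_QUOTE, a single STRING_CONTENT token holding "this is string", and CLOSE_QUOTE. The point to notice is that TEXT and NUMBER can only be produced outside the quotes and the STRING-mode rules only inside them, which is the same thing a -tokens dump from grun should make visible for the real grammar.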

Note: STRING lexer rules are generally more involved than this (I suggest looking at the STRING rules in other grammars), but this should be enough for your test and for understanding lexer modes.
