Hidden tokens to the default channel - AntlrV3

Posted on 2024-08-20 06:10:07

Suppose my whitespace (WS) tokens are on the hidden channel. For one particular rule only, I want whitespace to be considered as well. Is it possible to bring WS to the default channel for that rule alone in the parser?

Comments (3)

姐不稀罕 2024-08-27 06:10:07

Have a look at the answer to your path question and notice how I put a '\n' into the parser rule. You should be able to put ' ' in as well. The only concern is whether every whitespace alternative that your WS rule sends to the hidden channel also needs to appear in the rule.

For example:

rulename : Token1 ' ' Token2 ' ' Token1 {place action here};

Note that the rule name starts with a lowercase letter, which makes it a parser rule, while the "Token#" names start with an uppercase letter and are lexer rules. In this example the rule requires a space between the tokens. I suppose you could use something like (' '|'\t'|'\r'|'\n')+ instead, but I have not tried this and will leave it for you to attempt.

笑着哭最痛 2024-08-27 06:10:07

You can always query the hidden token stream, e.g. in C++:

myrule: MYTOK { static_cast<antlr::CommonHiddenStreamToken*>(LT(1).get())->getHiddenAfter()->getType() == WS }? MYTOK ;

The semantic predicate checks whether there is a whitespace token after matching the lexical token MYTOK.
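To make the idea concrete outside of any particular ANTLR runtime, here is a minimal Python sketch of the same kind of check. It is not ANTLR's API: the `Token` class, the `tokenize`, `visible` and `hidden_ws_between` helpers, the ID/INT token types, and the channel number 99 are all assumptions made for illustration. The lexer keeps every token in one list and tags whitespace with a hidden channel, so a predicate-like helper can still inspect what lies between two visible tokens.

```python
import re
from dataclasses import dataclass

# Channel numbers are arbitrary values chosen for this sketch.
DEFAULT, HIDDEN = 0, 99

@dataclass
class Token:
    type: str
    text: str
    channel: int
    index: int  # position in the full (all-channel) token list

# Hypothetical token types: ID, INT and WS.
TOKEN_RE = re.compile(r"(?P<ID>[A-Za-z_]+)|(?P<INT>\d+)|(?P<WS>\s+)")

def tokenize(text):
    """Lex text into tokens, routing WS to the hidden channel."""
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at offset {pos}")
        kind = m.lastgroup
        channel = HIDDEN if kind == "WS" else DEFAULT
        tokens.append(Token(kind, m.group(), channel, len(tokens)))
        pos = m.end()
    return tokens

def visible(tokens):
    """What a parser reading only the default channel would see."""
    return [t for t in tokens if t.channel == DEFAULT]

def hidden_ws_between(tokens, left, right):
    """Predicate: is there a hidden WS token between two visible tokens?"""
    between = tokens[left.index + 1:right.index]
    return any(t.type == "WS" and t.channel == HIDDEN for t in between)

if __name__ == "__main__":
    for source in ("ab 12", "ab12"):
        toks = tokenize(source)
        first, second = visible(toks)
        print(repr(source), hidden_ws_between(toks, first, second))
```

The parser-facing view (`visible`) never contains whitespace, yet the one rule that cares about it can still ask the question, which is the role the semantic predicate plays in the grammar above.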

笔落惊风雨 2024-08-27 06:10:07

Lexer rules are evaluated in the order they are listed in your grammar file.

This means you can have something like this:

STRING_LITERAL: '"' NONCONTROL_CHAR* '"';   


fragment NONCONTROL_CHAR: LETTER | DIGIT | UNDERSCORE |  SPACE | BACKSLASH | MINUS | COMMA;
fragment LETTER: LOWER | UPPER;
fragment LOWER: 'a'..'z';
fragment UPPER: 'A'..'Z';
fragment DIGIT: '0'..'9';
fragment SPACE: ' ' | '\t';
fragment UNDERSCORE: '_';   
fragment MINUS:  '-';
fragment BACKSLASH: '\\';

COMMA: ',';     

NEWLINE: ('\r'? '\n')+ { $channel = HIDDEN; };
TERMINATOR  : ';';


WHITESPACE: SPACE+ { $channel = HIDDEN; };

LINE_COMMENT
    :   
    '//' ~('\n'|'\r')*  ('\r\n' | '\r' | '\n') 
    {
        $channel = HIDDEN;
    }
    |   
    '//' ~('\n'|'\r')*     
    {
        $channel = HIDDEN;
    }
    ;   

As you can see, a string literal can contain spaces or tabs. A standalone space or tab, however, is sent to the HIDDEN channel.
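The effect can be tried out without generating a lexer. The following is a rough Python approximation of the grammar above, not ANTLR output: the regular expression, the `tokenize` and `default_channel` helpers, and the channel number 99 are assumptions made for this sketch. Each named group mirrors one lexer rule, and the rules that set `$channel = HIDDEN` in the grammar are listed in `HIDDEN_TYPES`.

```python
import re

# Channel numbers are arbitrary values chosen for this sketch.
DEFAULT, HIDDEN = 0, 99

# One named group per lexer rule of the grammar, in the same order.
LEXER_RE = re.compile(r'''
    (?P<STRING_LITERAL>"[A-Za-z0-9_ \t\\,-]*")
  | (?P<COMMA>,)
  | (?P<NEWLINE>(?:\r?\n)+)
  | (?P<TERMINATOR>;)
  | (?P<WHITESPACE>[ \t]+)
  | (?P<LINE_COMMENT>//[^\r\n]*(?:\r\n|\r|\n)?)
''', re.VERBOSE)

# Rules whose tokens are routed to the hidden channel.
HIDDEN_TYPES = {"NEWLINE", "WHITESPACE", "LINE_COMMENT"}

def tokenize(text):
    """Return a list of (type, text, channel) tuples for the whole input."""
    tokens, pos = [], 0
    while pos < len(text):
        m = LEXER_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at offset {pos}")
        kind = m.lastgroup
        channel = HIDDEN if kind in HIDDEN_TYPES else DEFAULT
        tokens.append((kind, m.group(), channel))
        pos = m.end()
    return tokens

def default_channel(tokens):
    """Texts of the tokens a parser on the default channel would receive."""
    return [text for _, text, channel in tokens if channel == DEFAULT]

if __name__ == "__main__":
    sample = '"a b",  "c\td" ; // note\n'
    for tok in tokenize(sample):
        print(tok)
    print(default_channel(tokenize(sample)))
```

In the sample input, the space inside `"a b"` and the tab inside the second literal stay part of their STRING_LITERAL tokens, while the spaces between tokens and the trailing comment end up on the hidden channel.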
