
How to append two lexer expressions - ANTLR4

Posted 2025-01-11 03:07:15

I need the lexer to parse two different character expressions as a single expression.

So I have something like this:

rootPath : 'A' rootType SEP childPath; // my output should be AB:2 or AC:4
childPath : RESERVED_NUMBERS;
rootType : ONE_LETTER;
SEP : ':';
RESERVED_NUMBERS : [1-9];
ONE_LETTER : [A-Z];

I get an error when parsing this. How can I combine 'A' and ONE_LETTER into a single string?


枯寂 2025-01-18 03:07:15

Based on your comments, it appears that you want to keep the two letters of your root level and sub level as separate tokens, but you have a "conflict" in your rootPath parser rule: the 'A' token literal and the ONE_LETTER rule both match the "A" character. If I have this right, you're not really "appending lexer expressions".

It's important to recognize that the 'A' in your grammar is just a syntactic shortcut for defining a Lexer rule (ANTLR will create it with a name something like T__0), so it's just another Lexer rule.

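It's also important to understand that your stream of input characters is used to create a stream of tokens for the parser to consume. There is nothing a parser rule can do to control whether "A" matches the T__0 ('A') rule or the ONE_LETTER rule. That decision is made by the lexer, and it has to pick one by looking only at the stream of input characters. When two rules match the same length of input, the rule defined first wins, and literals used in parser rules are defined ahead of the named lexer rules, so "A" always becomes a T__0 token.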
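To make the lexer's decision concrete, here is a minimal Python sketch of the matching policy (longest match wins, ties go to the earlier rule). It is a toy model written for illustration, not ANTLR's implementation; the `RULES` table and the `tokenize` helper are hypothetical names, and T__0 is assumed to be the name ANTLR gives the implicit 'A' token.

```python
import re

# Toy model of the lexer decision (an illustration, not ANTLR itself).
# T__0 stands in for the implicit token created for the 'A' literal;
# it is listed first because literals from parser rules are ordered
# ahead of the named lexer rules.
RULES = [
    ("T__0", re.compile(r"A")),
    ("SEP", re.compile(r":")),
    ("RESERVED_NUMBERS", re.compile(r"[1-9]")),
    ("ONE_LETTER", re.compile(r"[A-Z]")),
]


def tokenize(text, rules=RULES):
    """Return (token_type, lexeme) pairs: longest match, first rule wins ties."""
    tokens = []
    pos = 0
    while pos < len(text):
        best = None  # (rule name, end position) of the best match so far
        for name, pattern in rules:
            m = pattern.match(text, pos)
            # Only a strictly longer match replaces the current best,
            # so an equal-length match keeps the earlier rule.
            if m and m.end() > pos and (best is None or m.end() > best[1]):
                best = (name, m.end())
        if best is None:
            raise ValueError(f"no rule matches at position {pos}: {text[pos]!r}")
        tokens.append((best[0], text[pos:best[1]]))
        pos = best[1]
    return tokens


print(tokenize("AB:2"))
print(tokenize("AA:4"))
```

In this model every "A" comes out as T__0, including the second letter of "AA:4", so a parser rule that expects ONE_LETTER in that position never sees one. The parser has no say in the matter; the choice was already made from the characters alone.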

With that in mind, you should probably not try to fight the lexer. Instead, remove the 'A' literal so that both characters are recognized as ONE_LETTER tokens, and add a semantic predicate to your rootPath rule:

rootPath
    // Java target assumed: compare strings with equals(), not ==
    : rootLevel = ONE_LETTER {"A".equals($rootLevel.text)}? subLevel = ONE_LETTER SEP childPath
    ; // my output should be AB:2 or AC:4

Now the rootPath rule will only match if the rootLevel ONE_LETTER token is an "A", and you will have rootLevel and subLevel fields in your RootPathContext class.
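The effect of the predicate can be sketched in plain Python. This is a hand-written stand-in for the generated parser, under the assumption that the 'A' literal is gone and every capital letter is lexed as ONE_LETTER; `tokenize` and `parse_root_path` are hypothetical helpers, not part of ANTLR.

```python
import re

# Toy lexer with no 'A' literal: every capital letter is a ONE_LETTER token.
TOKEN_RE = re.compile(
    r"(?P<ONE_LETTER>[A-Z])|(?P<SEP>:)|(?P<RESERVED_NUMBERS>[1-9])"
)


def tokenize(text):
    """Split text into (token_type, lexeme) pairs."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


def parse_root_path(text):
    """Stand-in for: ONE_LETTER {predicate}? ONE_LETTER SEP childPath."""
    tokens = tokenize(text)
    types = [token_type for token_type, _ in tokens]
    if types != ["ONE_LETTER", "ONE_LETTER", "SEP", "RESERVED_NUMBERS"]:
        raise ValueError("expected LETTER LETTER ':' DIGIT")
    root_level, sub_level, _, child_path = (lexeme for _, lexeme in tokens)
    # Plays the role of the semantic predicate on the first token's text.
    if root_level != "A":
        raise ValueError("predicate failed: root level must be 'A'")
    return {"rootLevel": root_level, "subLevel": sub_level, "childPath": child_path}


print(parse_root_path("AB:2"))
print(parse_root_path("AA:4"))
```

Because the check happens on the token's text rather than its type, "AA:4" is accepted with "A" as the sub level, while an input such as "BC:2" is rejected by the predicate.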
