How to append two lexer expressions - ANTLR4
I need the lexer to parse two different character expressions as one expression.
So I have something like this:
rootPath : 'A' rootType SEP childPath; // my output should be AB:2 or AC:4
childPath : RESERVED_NUMBERS;
rootType : ONE_LETTER;
SEP : ':';
RESERVED_NUMBERS : [1-9];
ONE_LETTER : [A-Z];
I get an error when parsing this. How can I combine 'A' and ONE_LETTER into a single string?
Comments (1)
Based on your comments, it appears that you want to keep the two letters of your root level and sub level as separate tokens, but you have a "conflict" (in your rootLevel parser rule): your 'A' token literal and your ONE_LETTER rule both match the "A" character. If I have this right, you're not really "appending lexer expressions".

It's important to recognize that the 'A' in your grammar is just a syntactic shortcut for defining a lexer rule (ANTLR will create it with a name something like T__0), so it's just another lexer rule.

It's also important to understand that your stream of input characters is used to create a stream of tokens for the parser to consume. There is nothing a parser rule can do to control whether "A" matches the T__0 ('A') rule or the ONE_LETTER rule. That decision is made by the tokenizer, and it has to pick one by looking only at the stream of input characters.

With that in mind, you should probably not try to fight the lexer. Instead, allow both characters to be recognized as ONE_LETTER tokens, and add a semantic predicate to your rootPath rule. With that predicate in place, the rootPath rule will only match if the rootLevel ONE_LETTER token is an "A", and you will have rootLevel and subLevel fields in your RootPathContext class.
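The grammar snippet that accompanied this answer is not preserved in this copy. As a minimal sketch of the semantic-predicate approach it describes, the grammar might look something like the following. The grammar name RootPath is hypothetical, the rootLevel and subLevel labels are taken from the answer's wording, and the predicate body assumes the Java target (it is target-language code and would need rewriting for other targets).

```antlr
grammar RootPath; // hypothetical grammar name, chosen for illustration

// Both letters are lexed as ONE_LETTER; there is no 'A' literal any more,
// so the lexer has no competing implicit token (T__0) to choose from.
// The predicate checks that the upcoming first letter is "A"
// (Java target assumed: _input is the parser's token stream).
rootPath
    : {_input.LT(1).getText().equals("A")}?
      rootLevel=ONE_LETTER subLevel=ONE_LETTER SEP childPath
    ;

childPath : RESERVED_NUMBERS ;

SEP              : ':' ;
RESERVED_NUMBERS : [1-9] ;
ONE_LETTER       : [A-Z] ;
```

The rootLevel= and subLevel= labels are what make ANTLR generate the corresponding Token fields on RootPathContext, so a listener or visitor can read each letter separately after the parse.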