How to resolve the ANTLR error "attribute references not allowed in lexer actions"

Posted 2025-01-09 09:58:26

After reading Chapter 10 of "The Definitive ANTLR 4 Reference", I tried to write a simple analyzer to get lexical attributes, but I got an error. How can I get the lexical attributes?

lexer grammar TestLexer;

SPACE:                       [ \t\r\n]+ -> skip;

LINE:                        INT DOT [a-z]+ {System.out.println($INT.text);};
INT:                         [0-9]+;
DOT:                         '.';

Building with the ANTLR Maven plugin fails with:

[INFO] 
[INFO] --- antlr4-maven-plugin:4.9.2:antlr4 (antlr) @ parser ---
[INFO] ANTLR 4: Processing source directory /Users/Poison/IdeaProjects/parser/src/main/antlr4
[INFO] Processing grammar: me.tianshuang.parser/TestLexer.g4
[ERROR] error(128): me.tianshuang.parser/TestLexer.g4:5:65: attribute references not allowed in lexer actions: $INT.text
[ERROR] /Users/Poison/IdeaProjects/parser/me.tianshuang.parser/TestLexer.g4 [5:65]: attribute references not allowed in lexer actions: $INT.text

ANTLR4 version: 4.9.2.

Reference:
antlr4/actions.md at master · antlr/antlr4 · GitHub
How to get the token attributes in Antlr-4 lexer rule's action · Issue #1946 · antlr/antlr4 · GitHub

Comments (2)

圈圈圆圆圈圈 2025-01-16 09:58:26

How can I get the lexical attributes?

You can't: labels are simply not supported in lexer rules. You might say, "well, but I'm not using any labels!". But the following:

INT DOT [a-z]+ {System.out.println($INT.text);}

is just a shorthand notation for:

some_var_name=INT DOT [a-z]+ {System.out.println($some_var_name.text);}

where some_var_name is called a label.

If you remove the embedded code (the stuff between { and }), add a label before INT and then generate a lexer, you'll see the following warning being printed to stderr:

labels in lexer rules are not supported in ANTLR 4; actions cannot reference elements of lexical rules but you can use getText() to get the entire text matched for the rule

The last part means that you can grab the entire text of the lexer rule like this:

LINE
 : INT DOT [a-z]+ {System.out.println(getText());}
 ;

But grabbing the text of individual elements of a lexer rule through attribute references is not possible.
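One possible workaround, not part of the answer above, is to take the whole text from getText() and split it yourself in plain Java. The sketch below assumes the LINE rule from the question, so the matched text always has the shape digits, a dot, then lowercase letters. The class name LineParts and its methods are hypothetical names chosen for illustration.

```java
// Hypothetical helper for splitting the text matched by the LINE rule.
// Assumes the text looks like "<digits>.<lowercase letters>", e.g. "123.abc".
public class LineParts {

    // Returns the part before the first '.', i.e. what INT matched.
    public static String intPart(String lineText) {
        int dot = lineText.indexOf('.');
        if (dot < 0) {
            throw new IllegalArgumentException("no '.' in: " + lineText);
        }
        return lineText.substring(0, dot);
    }

    // Returns the part after the first '.', i.e. what [a-z]+ matched.
    public static String wordPart(String lineText) {
        int dot = lineText.indexOf('.');
        if (dot < 0) {
            throw new IllegalArgumentException("no '.' in: " + lineText);
        }
        return lineText.substring(dot + 1);
    }

    public static void main(String[] args) {
        System.out.println(intPart("123.abc"));
        System.out.println(wordPart("123.abc"));
    }
}
```

With such a class visible to the generated lexer, the action could read {System.out.println(LineParts.intPart(getText()));} instead of referencing $INT.text.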

一身骄傲 2025-01-16 09:58:26

Try to separate the concerns of the lexer from output and other matters: that separation is one of the main differences between ANTLR and Bison/Flex. For example, you can use the visitor or listener patterns covered in other chapters of the book.
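In the same spirit, attribute references such as $INT.text are allowed in parser rule actions, so one option is to move the structure out of the lexer and into a parser rule. The following is a minimal sketch under that assumption, not the asker's grammar: the grammar name Test, the rule name line and the token name WORD are hypothetical.

```antlr
// Sketch: a combined grammar where the structure lives in a parser rule.
grammar Test;

// Parser rule: $INT.text refers to the text of the INT token matched here.
line  : INT DOT WORD {System.out.println($INT.text);} ;

// Lexer rules only define the individual tokens.
INT   : [0-9]+ ;
DOT   : '.' ;
WORD  : [a-z]+ ;
SPACE : [ \t\r\n]+ -> skip ;
```

Note that this is not exactly equivalent to the original LINE token: because whitespace is skipped between tokens, input such as "12 . abc" would also match line. A listener or visitor over the resulting parse tree can replace the embedded action entirely.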
