ANTLR: how to parse a region within matching brackets with a lexer
I want to parse something like this in my lexer:
( begin expression )
where expressions are also surrounded by brackets. It isn't important what is in the expression; I just want everything between the (begin and the matching ) as a single token. An example would be:
(begin
(define x (+ 1 2)))
so the text of the token should be
(define x (+ 1 2)))
Something like
PROGRAM : LPAREN BEGIN .* RPAREN;
(obviously) does not work, because as soon as the lexer sees a ")" it considers the rule finished, whereas I need the matching bracket.
How can I do that?
Comments (1)
Inside lexer rules, you can invoke rules recursively, so that's one way to solve this. Another approach would be to keep track of the number of opening and closing parentheses and let a gated semantic predicate keep the loop going as long as your counter is greater than zero.
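To make the counter idea concrete, here is a minimal plain-Java sketch of the bookkeeping involved. It is not ANTLR-generated code and not a grammar; the class name BeginBlock and the method body are made up for this illustration, and it assumes the input starts with exactly "(begin" (no whitespace after the opening parenthesis). In a grammar, the same counter would live in lexer actions, with the predicate checking it on each pass through the loop.

```java
public class BeginBlock {

    /**
     * Returns the text between "(begin" and its matching ")",
     * or null if the input does not start with "(begin" or the
     * parentheses never balance. (Hypothetical helper, for illustration.)
     */
    public static String body(String src) {
        String prefix = "(begin";
        if (!src.startsWith(prefix)) {
            return null;
        }
        int depth = 1; // the "(" of "(begin" is already open
        for (int i = prefix.length(); i < src.length(); i++) {
            char c = src.charAt(i);
            if (c == '(') {
                depth++;
            } else if (c == ')') {
                depth--;
                if (depth == 0) {
                    // i is the ")" that closes "(begin"
                    return src.substring(prefix.length(), i).trim();
                }
            }
        }
        return null; // ran out of input before the counter reached zero
    }

    public static void main(String[] args) {
        System.out.println(body("(begin\n(define x (+ 1 2)))"));
    }
}
```

The loop stops at the first ")" that brings the counter back to zero, which is what the predicate-guarded loop in a lexer rule would do as well.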
A demo:
T.g
Main.java
Note that you'll need to watch out for string literals in your source that might contain parentheses, as well as comments that may contain parentheses.
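A rough sketch of how a counter-based scan could step over such literals and comments, again in plain Java rather than as a grammar. The literal and comment syntax here is an assumption: double-quoted strings with backslash escapes, and line comments starting with ";" (as in Scheme-like languages). The class name LiteralAwareBeginBlock and its methods are hypothetical.

```java
public class LiteralAwareBeginBlock {

    /**
     * Like a plain parenthesis counter, but ignores parentheses that
     * appear inside string literals or ";" line comments.
     * Returns null if the input is malformed or unbalanced.
     */
    public static String body(String src) {
        String prefix = "(begin";
        if (!src.startsWith(prefix)) {
            return null;
        }
        int depth = 1; // the "(" of "(begin" is already open
        int i = prefix.length();
        while (i < src.length()) {
            char c = src.charAt(i);
            if (c == '"') {
                i = skipString(src, i);
                if (i < 0) {
                    return null; // unterminated string literal
                }
                continue;
            }
            if (c == ';') {
                // Assumed comment syntax: skip up to (not past) the line break.
                while (i < src.length() && src.charAt(i) != '\n') {
                    i++;
                }
                continue;
            }
            if (c == '(') {
                depth++;
            } else if (c == ')') {
                depth--;
                if (depth == 0) {
                    return src.substring(prefix.length(), i).trim();
                }
            }
            i++;
        }
        return null;
    }

    /** Returns the index just past the closing quote, or -1 if there is none. */
    private static int skipString(String src, int open) {
        for (int i = open + 1; i < src.length(); i++) {
            char c = src.charAt(i);
            if (c == '\\') {
                i++; // skip the escaped character
            } else if (c == '"') {
                return i + 1;
            }
        }
        return -1;
    }
}
```

In a lexer grammar, the equivalent would be to add alternatives for string literals and comments inside the loop so that their contents never reach the parenthesis alternatives.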
The suggestion with the predicate uses some language-specific code (Java, in this case). An advantage of calling a lexer rule recursively is that you don't need any custom code in your lexer.