Bison: How to ignore a token if it doesn't fit into a rule
I'm writing a program that handles comments as well as a few other things. If a comment is in a specific place, then my program does something.
Flex passes a token upon finding a comment, and Bison then looks to see if that token fits into a particular rule. If it does, then it takes an action associated with that rule.
Here's the thing: the input I'm receiving might actually have comments in the wrong places. In this case, I just want to ignore the comment rather than flagging an error.
My question:
How can I use a token if it fits into a rule, but ignore it if it doesn't? Can I make a token "optional"?
(Note: The only way I can think of doing this right now is scattering the comment token in every possible place in every possible rule. There MUST be a better solution than this. Maybe some rule involving the root?)
Comments (2)
One solution may be to use bison's error recovery (see the Bison manual).
To summarize, bison defines the terminal token `error` to represent an error (say, a comment token returned in the wrong place). That way, you can (for example) close parentheses or braces after the wayward comment is found. However, this method will probably discard a certain amount of parsing, because I don't think bison can "undo" reductions. ("Flagging" the error, as with printing a message to stderr, is not related to this: you can have an error without printing an error; it depends on how you define `yyerror`.)
You may instead want to wrap each terminal in a special nonterminal:
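The code that originally followed this sentence did not survive on this page. As a rough sketch of what such wrapping could look like (not the answer's actual grammar), the fragment below pairs each terminal with any comments that trail it. All token and rule names (`PRINT`, `NUMBER`, `PLEASE`, `COMMENT`, `comments`) are assumptions chosen for illustration.

```yacc
/* Sketch only: every token and rule name here is hypothetical. */
%token PRINT NUMBER PLEASE COMMENT

%%

statement
    : print number please
    ;

/* Each wrapper pairs one terminal with the comments that follow it,
   so COMMENT is mentioned once per terminal rather than once per
   position in every rule that uses the terminal. */
print  : PRINT  comments ;
number : NUMBER comments ;
please : PLEASE comments ;

/* Zero or more comment tokens. */
comments
    : /* empty */
    | comments COMMENT
    ;
```

Because the comments are attached only after a terminal, this particular arrangement should not need the empty rule in two adjacent places. The trade-off is that a comment before the very first token is not covered and would need its own rule.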
This effectively does what you're scared to do (put in a comment in every single rule), but it does it in fewer places.
To force myself to eat my own dog food, I made up a silly language for myself. The only syntax is `print <number> please`, but if there's (at least) one comment (`##`) between the number and the `please`, it prints the number in hexadecimal instead. Like this:
My lexer:
and the parser:
So, as you can see, not exactly rocket science, but it does the trick. There's a shift/reduce conflict in there, because of the empty string matching `comment` in multiple places. Also, there's no rule to fit comments in between the final `please` and `EOF`. But overall, I think it's a good example.
Treat comments as whitespace at the lexer level.
But keep two separate rules, one for whitespace and one for comments, both returning the same token ID.
When you enter that “specific place”, check whether the last whitespace was a comment, or trigger an error.
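One way to read this suggestion is sketched below. It is a variant, not the answerer's code: a hand-written `yylex` skips whitespace and comments alike, so the grammar never sees them, and it records whether the skipped run contained a comment. The flag travels as the semantic value of the token that follows, so the parser's lookahead timing does not matter. It reuses the toy `print <number> please` language from the other answer. The assumptions here are that a comment runs from `#` to the end of the line, the `UNKNOWN` token, and the file name `demo.y`.

```yacc
%{
#include <ctype.h>
#include <stdio.h>
#include <string.h>

int yylex(void);
void yyerror(const char *msg);
%}

/* UNKNOWN is never used in a rule, so returning it forces a syntax error. */
%token PRINT NUMBER PLEASE UNKNOWN

%%

input
    : /* empty */
    | input statement
    ;

/* $2 is the number; $3 is 1 if a comment directly preceded "please". */
statement
    : PRINT NUMBER PLEASE
        { if ($3) printf("%x\n", (unsigned)$2); else printf("%d\n", $2); }
    ;

%%

int yylex(void)
{
    int c;
    int saw_comment = 0;

    /* Skip whitespace and comments the same way, remembering any comment. */
    for (;;) {
        c = getchar();
        if (c == '#') {                      /* assumed: comment runs to end of line */
            saw_comment = 1;
            while ((c = getchar()) != EOF && c != '\n')
                ;
        }
        if (c == EOF)
            return 0;                        /* end of input */
        if (!isspace(c))
            break;
    }

    if (isdigit(c)) {
        int value = 0;
        do {
            value = value * 10 + (c - '0');
            c = getchar();
        } while (isdigit(c));
        ungetc(c, stdin);
        yylval = value;                      /* a comment before a number is simply dropped */
        return NUMBER;
    }

    if (isalpha(c)) {
        char word[16];
        size_t n = 0;
        do {
            if (n < sizeof word - 1)
                word[n++] = (char)c;
            c = getchar();
        } while (isalpha(c));
        ungetc(c, stdin);
        word[n] = '\0';
        if (strcmp(word, "print") == 0)
            return PRINT;
        if (strcmp(word, "please") == 0) {
            yylval = saw_comment;            /* the one place where a comment matters */
            return PLEASE;
        }
    }
    return UNKNOWN;
}

void yyerror(const char *msg) { fprintf(stderr, "%s\n", msg); }

int main(void) { return yyparse(); }
```

Built with something like `bison -o demo.c demo.y && cc demo.c -o demo`, the input `print 255 ## hex` followed by `please` on the next line should print `ff`, while a comment anywhere else is silently ignored. Raising an error when the expected comment is missing, as the answer suggests, would be an extra check in the action.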