解决 LALR 解析器中的移位/归约冲突
我一直在使用 PLY 为我的语言构建解析器,但是我遇到了移位/归约冲突,这给我带来了一些麻烦。我的语言具有通用类型,其语法与 C++ 模板相同。所以现在我有这样的规则:
expression : expression LESS expression %prec COMPARISON
expression : template
template : NAME
| NAME LESS templates GREATER
templates : template
| templates COMMA template
但是,我发现它无法解析:(
a < 2
由于显而易见的原因,这是一个问题)。以下是调试输出:
PLY: PARSE DEBUG START
State : 0
Stack : . <Token: 'NAME' 'a'>
Action : Shift and goto state 42
State : 42
Stack : NAME . <Token: 'LESS' '<'>
Action : Shift and goto state 81
State : 81
Stack : NAME LESS . <Token: 'NUMBER' '2'>
ERROR: Error : NAME LESS . <Token: 'NUMBER' '2'>
如果需要更多我的解析器,我可以提供它。谢谢。
编辑:向我建议的一种解决方案是使类型成为自己的令牌。这需要一些工作,因为我的语言不使用像 C/C++ 这样的预处理器包含系统,但我认为它仍然是可能的,但我更喜欢仅限于语法的解决方案。
I've been using PLY to build up a parser for my language, however I've got a shift/reduce conflict that's causing me some trouble. My language has generic types with a syntax ala C++ templates. So right now I have rules like:
expression : expression LESS expression %prec COMPARISON
expression : template
template : NAME
| NAME LESS templates GREATER
templates : template
| templates COMMA template
However, I've found that it's unable to parse:
a < 2
(which is a problem for obvious reasons). The following is the debug output:
PLY: PARSE DEBUG START
State : 0
Stack : . <Token: 'NAME' 'a'>
Action : Shift and goto state 42
State : 42
Stack : NAME . <Token: 'LESS' '<'>
Action : Shift and goto state 81
State : 81
Stack : NAME LESS . <Token: 'NUMBER' '2'>
ERROR: Error : NAME LESS . <Token: 'NUMBER' '2'>
If more of my parser is needed I can provide it. Thanks.
EDIT: One solution that was suggested to me was to make types their own token. This would require a little bit of work because my language doesn't use a preprocessor include system like C/C++ however I think it would still be possible, I'd prefer a solution restricted to the grammar however.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
Yacc 解析器并不是特别强大,尝试上下文无关解析可能要求太多。我建议使用某种技巧让 yacc 表现得像解析上下文相关的语法,或者不要尝试使用解析器来强制执行每个语法规则。
识别何时解析类型,设置标志或调用方法将其传达给扫描器,然后为
<
和>
。或者,继续使用统一的表达式/模板语法进行部分模板生成,并在代码中错误地排除除模板语法之外的任何内容。解析器是系统中功能最弱的部分,因此请尽可能将工作放入代码中。 (对代码没有限制,对 yacc 有很多限制。)
我并不是说这些是您唯一的选择。如果您花了几天时间研究状态表并将语法调整到 yacc 满意的程度,我想您会“成功”,但这是不值得的。那时您可能刚刚编写了一个递归下降解析器。 (RD 是更多的代码行,你看不到 BNFish yacc 中整齐排列的语法,但至少你可以解析任何东西,并且你永远不会陷入“它不起作用”的难题中。)
Python 有吗有任何相当于 Ruby 的 Treetop 的东西吗?这样就解决了问题。 Bison 的
%glr-parser
功能也可以“解决”这样的问题,尽管是以一种相当 BFI 的方式。Yacc parsers are not particularly powerful and attempting a context-free parse may be asking too much of it. I suggest using some kind of a trick to make yacc act like it is parsing a context-sensitive grammar, or, don't try to use the parser to enforce every syntax rule.
Recognize when you are parsing a type, set a flag or call a method to communicate this to the scanner, and then return a different terminal symbol for
<
and>
in this case.Alternatively, go ahead and just use a unified expression/template syntax for part of the template production, and error out anything other than template syntax in code. The parser is the least-capable part of your system, so push work into code where possible. (No limits to code, lots of limits to yacc.)
I'm not saying that these are your only choices. If you spent days puzzling over the state tables and tweaking the grammar to the point where yacc is happy with it, I imagine you would be "successful", but it's not worth it. At that point you may as well have just written a recursive descent parser. (RD is more lines of code, and you don't get to see the grammar neatly laid out in BNFish yacc, but at least you can parse anything and you never get bogged down with "it's not working" puzzles.)
Does Python have any equivalent to Ruby's Treetop? That would solve the problem. Bison's
%glr-parser
feature can also "solve" problems like this, albeit in a rather BFI way.