柠檬解析器是 LALR(1) 还是 SLR(1)?
我正在阅读柠檬解析器的 PHP 移植:
for ($i = 0; $i < $this->nstate; $i++) { /* Loop over all states */
$stp = $this->sorted[$i]->data;
for ($cfp = $stp->cfp; $cfp; $cfp = $cfp->next) {
/* Loop over all configurations */
if ($cfp->rp->nrhs == $cfp->dot) { /* Is dot at extreme right? */
for ($j = 0; $j < $this->nterminal; $j++) {
if (isset($cfp->fws[$j])) {
/* Add a reduce action to the state "stp" which will reduce by the
** rule "cfp->rp" if the lookahead symbol is "$this->symbols[j]" */
PHP_ParserGenerator_Action::Action_add($stp->ap, PHP_ParserGenerator_Action::REDUCE,
$this->symbols[$j], $cfp->rp);
}
}
}
}
}
在我看来,根据它如何计算操作表,解析器是一个 SLR(1) 解析器,但是@柠檬的主页,它将自己展示为 LALR(1) ) 解析器:
http://www.hwaci.com/sw/lemon/
是 SLR(1) 还是 LALR(1) ?
I'm reading the PHP portation of the lemon parser:
for ($i = 0; $i < $this->nstate; $i++) { /* Loop over all states */
$stp = $this->sorted[$i]->data;
for ($cfp = $stp->cfp; $cfp; $cfp = $cfp->next) {
/* Loop over all configurations */
if ($cfp->rp->nrhs == $cfp->dot) { /* Is dot at extreme right? */
for ($j = 0; $j < $this->nterminal; $j++) {
if (isset($cfp->fws[$j])) {
/* Add a reduce action to the state "stp" which will reduce by the
** rule "cfp->rp" if the lookahead symbol is "$this->symbols[j]" */
PHP_ParserGenerator_Action::Action_add($stp->ap, PHP_ParserGenerator_Action::REDUCE,
$this->symbols[$j], $cfp->rp);
}
}
}
}
}
It seems to me the parser is a SLR(1) parser according to how it computes the action table,but @the home page of lemon,it demonstrates itself as a LALR(1) parser:
http://www.hwaci.com/sw/lemon/
Is it SLR(1) or LALR(1) ?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
如果是纯 SLR,则不会有任何前瞻符号 ($this->symbol[$j]) 用于控制缩减。所以我的结论是LALR(1)。
编辑:yoyo 是对的 SLR(1) 确实使用下一个输入符号来控制减少(我将问题误读为 [LALR(1) vs] SLR(0),这根本不在乎) ;我纠正了。在检查时,SLR(1)使用(产生式规则上下文无关)FOLLOW集合来控制约简; LALR(1) 使用(左上下文相关)LOOKAHEAD 集。因此,两者都对每次缩减设置了“前瞻”。这意味着你无法从这段代码片段中判断出它是哪一种;最好的情况是,我们希望编码器真正计算“一个”前瞻集。您必须查看其余代码才能知道它是什么类型。
实际上,如果您要构建一个自下而上的解析器生成器,您可以选择构建 SLR(0) [我曾经这样做过,这就是我的大脑误读问题的方式]、SLR(1)、LALR (1) 和 LR(1) 解析器生成器使用几乎完全相同的机制。 30年的经验表明,LALR(1)是其中最实用的,所以默认是...LALR(1); SLR(x) 严格来说是一个子集,那么如果稍微多一点努力就能得到 LALR(1),为什么还要费心这样做呢?如果 Lemon 实现者遵循传统,我期望有一个 LALR(1) 解析器生成器。所以现在你可以相信他们的话了。
当然,您可以构建一个简单的实验来说服自己。只需构建一个 SLR(1) 无法正确解析 LALR(1) 可以的语法,然后尝试一下。或者你可以仔细阅读代码。
请参阅 LALR 解析 http://en.wikipedia.org/wiki/LALR_parser
If it were pure SLR, there wouldn't be any lookahead symbols ($this->symbol[$j]) used to control a reduction. So I conclude it is LALR(1).
EDIT: yoyo is right SLR(1) does use next-input symbols to control reductions (I misread the question as [LALR(1) vs] SLR(0), which simply doesn't care); I stand corrected. In checking, SLR(1) uses the (production rule context-free) FOLLOW set to control reductions; LALR(1) uses the (left-context dependent) LOOKAHEAD set. So, both have a "lookahead" set on each reduction. That means you can't tell from this code fragment which kind it is; at best we hope the coder is really computing "a" lookahead set. You'd have to see the rest of the code to know what kind it is.
As a practical matter, if you are going to build a bottom up parser generator, you can choose to build SLR(0) [which I did once upon a time and that's how my brain misread the question), SLR(1), LALR(1), and LR(1) parser generators using almost the exact same machinery. 30 years of experience has shown that LALR(1) is the most practical of these, so the default is ... LALR(1); SLR(x) is strictly a subset so why bother doing that if only a tiny bit more effort gets you LALR(1)? If the Lemon implementer follows tradition, I'd expect an LALR(1) parser generator. So now you have sort of take their word for it.
Of course, you can construct a simple experiment to convince yourself. Simply build a grammar that SLR(1) can't propertly parse that LALR(1) can, and try it. Or you can read the code really carefully.
See LALR parsing at http://en.wikipedia.org/wiki/LALR_parser