语法中的左递归会导致冲突

发布于 2024-08-10 09:58:32 字数 1215 浏览 0 评论 0原文

在整个 Bison 语法中,我使用右递归,并且我读到左递归更好,因为它不必首先构建整个堆栈。

然而,当我尝试在其中任何一个上切换到左递归时,我总是会遇到很多冲突,而且我不明白为什么。

任何人都可以向我展示一个通用示例,说明使用左递归而不是右递归会导致冲突(当右递归不会导致冲突时)。那么当向左切换时需要做什么来纠正这样的冲突。我认为一个基本的例子对我的帮助不仅仅是纠正我自己的语法。

编辑:
但我想我无论如何都应该包含一个特定的示例,因为我的理解还不够完整:-)将“列表分隔符命令”更改为“命令分隔符列表”可以解决冲突。

State 9 conflicts: 3 shift/reduce


Grammar

    0 $accept: input $end

    1 input: error NEWLINE
    2      | input NEWLINE
    3      | input list NEWLINE
    4      | /* empty */

    5 list: command
    6     | command separator
    7     | list separator command

    8 separator: SEMI
    9          | L_OR
   10          | L_AND

   11 command: variable_assignment
   12        | external_w_redir
   13        | external_w_redir AMP
   14        | pipeline
   15        | pipeline AMP

...

state 9

    5 list: command .
    6     | command . separator

    SEMI   shift, and go to state 18
    L_AND  shift, and go to state 19
    L_OR   shift, and go to state 20

    SEMI      [reduce using rule 5 (list)]
    L_AND     [reduce using rule 5 (list)]
    L_OR      [reduce using rule 5 (list)]
    $default  reduce using rule 5 (list)

    separator  go to state 22

Throughout a Bison grammar I am using right recursion, and I have read that left recursion is better because it doesn't have to build the whole stack first.

However, when I try to switch to left recursion on any of them, I always end up with a lot of conflicts, and I am not seeing why.

Can anyone show me a generic example of where using left recursion instead of right causes a conflict (when the right recursion doesn't cause a conflict). Then what needs to be done when switching to left to correct such a conflict. I think a fundamental example will help me more than just a correction of my own grammar.

Edit:
But I guess I should include a particular example anyways, since my understand is a little less than complete :-) Changing 'list separator command' to 'command separator list' resolves the conflict.

State 9 conflicts: 3 shift/reduce


Grammar

    0 $accept: input $end

    1 input: error NEWLINE
    2      | input NEWLINE
    3      | input list NEWLINE
    4      | /* empty */

    5 list: command
    6     | command separator
    7     | list separator command

    8 separator: SEMI
    9          | L_OR
   10          | L_AND

   11 command: variable_assignment
   12        | external_w_redir
   13        | external_w_redir AMP
   14        | pipeline
   15        | pipeline AMP

...

state 9

    5 list: command .
    6     | command . separator

    SEMI   shift, and go to state 18
    L_AND  shift, and go to state 19
    L_OR   shift, and go to state 20

    SEMI      [reduce using rule 5 (list)]
    L_AND     [reduce using rule 5 (list)]
    L_OR      [reduce using rule 5 (list)]
    $default  reduce using rule 5 (list)

    separator  go to state 22

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

っ〆星空下的拥抱 2024-08-17 09:58:32

编辑:

我必须撤回我原来的答案。正如我首先想到的那样,您的左递归语法似乎并不含糊。我想我对用于使最终分隔符可选的额外规则感到困惑。

这是原始右递归语法的简化版本:

list: COMMAND
      | COMMAND SEPARATOR
      | COMMAND SEPARATOR list
      ;

该语法匹配(如果我没有比我想象的更困惑,这当然是可能的)输入 C,CS,CSC,CSCS,CSCSC, CSCSCS 等。即,一系列由 SEPARATOR 分隔的命令,末尾有一个可选的 SEPARATOR。

这是左递归语法的简化版本,它在 Bison 中产生了移位/归约冲突:

list: COMMAND
      | COMMAND SEPARATOR
      | list SEPARATOR COMMAND
      ;

如果我正确扩展了内容,则此语法与输入 C、CS、CSC、CSSC、CSCSC、CSSCSC 等相匹配。它是没有歧义,但它不等于你的左递归语法。命令列表末尾不能有分隔符,并且命令之间的分隔符可以加倍。

我认为移位/归约冲突的原因是 Bison 在决定是否移位或归约时只向前看一个标记,并且使用第二个语法中引入的双分隔符,它有时会感到困惑。

如果最后的分隔符是可选的并且语法必须是左递归的,我建议重写它:

list: separatedlist
      | separatedlist SEPARATOR
      ;

separatedlist: COMMAND
               | separatedlist SEPARATOR COMMAND
               ;

但我不会担心左或右,除非你的列表真的 long,或者您打算在非常有限的硬件上运行解析器。我认为在现代台式计算机上这并不重要。

EDIT:

I have to retract my original answer. Your left-recursive grammar does not seem to be ambiguous, as I first thought. I think I got confused by the extra rule which is used to make a final separator optional.

Here is a simplified version of your original, right-recursive, grammar:

list: COMMAND
      | COMMAND SEPARATOR
      | COMMAND SEPARATOR list
      ;

This grammar matches (if I'm not even more confused than I think, which is of course a possibility) the inputs C, CS, CSC, CSCS, CSCSC, CSCSCS, etc. That is, a sequence of SEPARATOR-separated COMMANDs, with an optional SEPARATOR at the end.

And this is the simplified version of your left-recursive grammar, which gives a shift/reduce conflict in Bison:

list: COMMAND
      | COMMAND SEPARATOR
      | list SEPARATOR COMMAND
      ;

If I expanded things correctly, this grammar matches the inputs C, CS, CSC, CSSC, CSCSC, CSSCSC, etc. It is not ambiguous, but it is not equivalent to your left-recursive grammar. The list of COMMANDs can not have a SEPARATOR at the end, and the SEPARATORs between the COMMANDS can be doubled.

I think the reason for the shift/reduce conflict is that Bison only looks one token ahead when it decides if to shift or reduce, and with the double separators that are introduced in the second grammar, it can sometimes get confused.

If it is important that the final separator is optional, and that the grammar must be left-recursive, I would suggest re-writing it:

list: separatedlist
      | separatedlist SEPARATOR
      ;

separatedlist: COMMAND
               | separatedlist SEPARATOR COMMAND
               ;

But I wouldn't worry about left or right, unless your lists are really long, or you intend to run your parser on very limited hardware. I don't think it matters much, on modern desktop computers.

老旧海报 2024-08-17 09:58:32

混乱/错误似乎来自列表的制作。你的左递归版本是:

list: command
    | command separator
    | list separator command

你的右递归版本是:

list: command
    | command separator
    | command separator list

这两个规则集实际上是完全不同的,匹配不同的东西。第一个匹配一个或多个命令,命令之间有分隔符,并且每个命令后面有一个可选的额外分隔符 >。第二个类似,只是它只允许在LAST命令之后使用额外的可选分隔符

假设你真正想要的是后者的左递归版本,你想要的是

list_no_trailer: command
               | list_no_trailer separator command

list: list_no_trailer
    | list_no_trailer separator

你的版本中的冲突来自这样一个事实:在解析“命令”并在lookahed上看到分隔符之后,它不知道是否要将其作为命令后面的可选分隔符拉入,或者在假设它是列表分隔符并且后面会有另一个命令的情况下减少命令。为此,需要 2 个字符的前瞻,因此您的语法是 LR(2),而不是 LR(1)

The confusion/errors seem to come from the productions for list. Your left recursive version is:

list: command
    | command separator
    | list separator command

Your right recursive version is:

list: command
    | command separator
    | command separator list

These two rule sets are actually quite different, matching different things. The first one matches one or more commands with separators between them, and an optional extra separator after each command. The second is similar, except it only allows the extra optional separator after the LAST command.

Assuming what you really want is a left-recursive version of the latter, what you want is

list_no_trailer: command
               | list_no_trailer separator command

list: list_no_trailer
    | list_no_trailer separator

The conflict in your version come from the fact that after parsing a 'command' and seeing a separator on the lookahed, it doesn;t know whether to pull that in as the optional separator after the command, or to reduce the command under the assumption that its a list separator and there will be another command right after it. It would need 2 characters of lookahead for that, so your grammar is LR(2), but not LR(1)

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文