如何解决明确语法中的移位归约冲突

发布于 2024-07-22 12:09:13 字数 720 浏览 10 评论 0原文

我正在尝试使用 LALR(1) 解析器生成器(Bison,但问题并非特定于该工具)来解析简单的语法,并且遇到了移位归约冲突。 我发现的有关修复这些问题的文档和其他来源往往会说明以下一项或多项内容:

  • 如果语法不明确(例如 if-then-else 不明确),请更改语言以修复不明确之处。
  • 如果是运算符优先级问题,请明确指定优先级。
  • 接受默认分辨率并告诉生成器不要抱怨它。

然而,这些似乎都不适用于我的情况:据我所知,语法是明确的(当然,它只有一个前瞻字符是不明确的),它只有一个运算符,默认解析会导致解析错误在正确形成的输入上。 是否有任何技术可以重新设计语法的定义,以消除不属于上述范围的移位归约冲突?

具体而言,这里是有问题的语法:

%token LETTER

%%
%start input;
input:          /* empty */ | input input_elt;
input_elt:      rule | statement;
statement:      successor ';';
rule:           LETTER "->" successor ';';
successor:      /* empty */ | successor LETTER;
%%

目的是解析“[A-Za-z]+”或“[A-Za-z] -> [A-Za-z] 形式的分号分隔行+”。

I'm trying to parse a simple grammar using an LALR(1) parser generator (Bison, but the problem is not specific to that tool), and I'm hitting a shift-reduce conflict. The docs and other sources I've found about fixing these tend to say one or more of the following:

  • If the grammar is ambiguous (e.g. if-then-else ambiguity), change the language to fix the ambiguity.
  • If it's an operator precedence issue, specify precedence explicitly.
  • Accept the default resolution and tell the generator not to complain about it.

However, none of these seem to apply to my situation: the grammar is unambiguous so far as I can tell (though of course it's ambiguous with only one character of lookahead), it has only one operator, and the default resolution leads to parse errors on correctly-formed input. Are there any techniques for reworking the definition of a grammar to remove shift-reduce conflicts that don't fall into the above buckets?

For concreteness, here's the grammar in question:

%token LETTER

%%
%start input;
input:          /* empty */ | input input_elt;
input_elt:      rule | statement;
statement:      successor ';';
rule:           LETTER "->" successor ';';
successor:      /* empty */ | successor LETTER;
%%

The intent is to parse semicolon-separated lines of the form "[A-Za-z]+" or "[A-Za-z] -> [A-Za-z]+".

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

恰似旧人归 2024-07-29 12:09:13

使用 Solaris 版本的 yacc,我得到:

1: shift/reduce conflict (shift 5, red'n 7) on LETTER
state 1
    $accept :  input_$end
    input :  input_input_elt
    successor : _    (7)

    $end  accept
    LETTER  shift 5
    .  reduce 7

    input_elt  goto 2
    rule  goto 3
    statement  goto 4
    successor  goto 6

所以,问题是,正如经常出现的那样,空规则 - 具体来说,空后继者。 目前尚不完全清楚您是否希望允许分号作为有效输入 - 目前是这样。 如果将后继规则修改为:

successor: LETTER | successor LETTER;

则消除了移位/归约冲突。

Using the Solaris version of yacc, I get:

1: shift/reduce conflict (shift 5, red'n 7) on LETTER
state 1
    $accept :  input_$end
    input :  input_input_elt
    successor : _    (7)

    $end  accept
    LETTER  shift 5
    .  reduce 7

    input_elt  goto 2
    rule  goto 3
    statement  goto 4
    successor  goto 6

So, the trouble is, as it very often is, the empty rule - specifically, the empty successor. It isn't completely clear whether you want to allow a semi-colon as a valid input - at the moment, it is. If you modified the successor rule to:

successor: LETTER | successor LETTER;

the shift/reduce conflict is eliminated.

一身仙ぐ女味 2024-07-29 12:09:13

感谢您削减语法并发布它。 将继任者规则更改为 -

successor:      /* empty */ | LETTER successor;

...对我有用。 ITYM 这种语言看起来很明确。

Thanks for whittling down the grammar and posting it. Changing the successor rule to -

successor:      /* empty */ | LETTER successor;

...worked for me. ITYM the language looked unambiguous.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文