Shift/reduce conflict in yacc due to a lookahead token limitation?

Posted on 2024-10-12 04:25:53

I've been trying to tackle a seemingly simple shift/reduce conflict, to no avail. Naturally, the parser works fine if I just ignore the conflict, but I'd feel much safer if I reorganized my rules. Here, I've reduced a relatively complex grammar to the single conflict:

statement_list
  : statement_list statement 
  | 
  ;

statement
  : lvalue '=' expression
  | function
  ;

lvalue
  : IDENTIFIER
  | '(' expression ')'
  ;

expression
  : lvalue
  | function
  ;

function
  : IDENTIFIER '(' ')'
  ;

With the verbose option in yacc, I get this output file describing the state with the mentioned conflict:

state 2

    lvalue  ->  IDENTIFIER .   (rule 5)
    function  ->  IDENTIFIER . '(' ')'   (rule 9)

    '('  shift, and go to state 7

    '('  [reduce using rule 5 (lvalue)]
    $default reduce using rule 5 (lvalue)

Thank you for any assistance.


Comments (1)

少女的英雄梦 2024-10-19 04:25:54

The problem is that the parser needs two tokens of lookahead to know when it has reached the end of a statement. If you have input of the form:

ID = ID ( ID ) = ID

After the parser shifts the second ID (the lookahead is '('), it doesn't know whether that is the end of the first statement (so the '(' begins a second statement) or the start of a function call. It therefore shifts (continuing to parse a function), which is the wrong thing to do for the example input above.

If you extend function to allow an argument inside the parentheses and expression to allow actual expressions, things become worse, as the lookahead required is unbounded: the parser needs to get all the way to the second '=' to determine that this is not a function call.

The basic problem here is that there's no helper punctuation to aid the parser in finding the end of a statement. Since text that is the beginning of a valid statement can also appear in the middle of a valid statement, finding statement boundaries is hard.
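
One way to act on this, as a minimal sketch rather than the only possible fix, is to give each statement an explicit terminator. The ';' token below is an assumption: it is not in the original grammar, and adding it changes the language being parsed. The %token line and the %% separator are also assumed, since the excerpt in the question leaves the declarations section out. With the terminator in place, '(' can no longer follow a complete lvalue or expression, so the state reached after IDENTIFIER reduces only on '=', ';' or ')' and shifts on '('.

```yacc
%token IDENTIFIER

%%

statement_list
  : statement_list statement ';'   /* ';' is the assumed statement terminator */
  | /* empty */
  ;

statement
  : lvalue '=' expression
  | function
  ;

lvalue
  : IDENTIFIER
  | '(' expression ')'
  ;

expression
  : lvalue
  | function
  ;

function
  : IDENTIFIER '(' ')'
  ;
```

With this version the example input would be written as ID = ID ; ( ID ) = ID ; and a single token of lookahead is enough to tell the two readings apart. Running yacc with the verbose option on this grammar should no longer report a conflict in the state after IDENTIFIER.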
