Python/YACC:解决移位/归约冲突

发布于 2024-09-03 08:30:32 字数 1686 浏览 9 评论 0原文

我正在使用 PLY。这是我来自 parser.out 的状态之一:

state 3

    (5) course_data -> course .
    (6) course_data -> course . course_list_tail
    (3) or_phrase -> course . OR_CONJ COURSE_NUMBER
    (7) course_list_tail -> . , COURSE_NUMBER
    (8) course_list_tail -> . , COURSE_NUMBER course_list_tail

  ! shift/reduce conflict for OR_CONJ resolved as shift
    $end            reduce using rule 5 (course_data -> course .)
    OR_CONJ         shift and go to state 7
    ,               shift and go to state 8

  ! OR_CONJ         [ reduce using rule 5 (course_data -> course .) ]

    course_list_tail               shift and go to state 9

我想将其解决为:

if OR_CONJ is followed by COURSE_NUMBER:
    shift and go to state 7
else:
    reduce using rule 5 (course_data -> course .)

如何修复我的解析器文件以反映这一点?我是否需要通过回溯并尝试不同的规则来处理语法错误?

文档说:

然后使用这些值附加一个 数值优先值和 每个的关联方向 语法规则。这总是 通过查看确定 最右边终端的优先级 符号。

如果规则没有终端怎么办?

更新:完整语法:

Grammar

Rule 0     S' -> statement
Rule 1     statement -> course_data
Rule 2     or_phrase -> statement OR_CONJ statement
Rule 3     or_phrase -> course OR_CONJ COURSE_NUMBER
Rule 4     statement -> or_phrase
Rule 5     course_data -> course
Rule 6     course_data -> course course_list_tail
Rule 7     course_list_tail -> , COURSE_NUMBER
Rule 8     course_list_tail -> , COURSE_NUMBER course_list_tail
Rule 9     course -> DEPT_CODE COURSE_NUMBER

I'm using PLY. Here is one of my states from parser.out:

state 3

    (5) course_data -> course .
    (6) course_data -> course . course_list_tail
    (3) or_phrase -> course . OR_CONJ COURSE_NUMBER
    (7) course_list_tail -> . , COURSE_NUMBER
    (8) course_list_tail -> . , COURSE_NUMBER course_list_tail

  ! shift/reduce conflict for OR_CONJ resolved as shift
    $end            reduce using rule 5 (course_data -> course .)
    OR_CONJ         shift and go to state 7
    ,               shift and go to state 8

  ! OR_CONJ         [ reduce using rule 5 (course_data -> course .) ]

    course_list_tail               shift and go to state 9

I want to resolve this as:

if OR_CONJ is followed by COURSE_NUMBER:
    shift and go to state 7
else:
    reduce using rule 5 (course_data -> course .)

How can I fix my parser file to reflect this? Do I need to handle a syntax error by backtracking and trying a different rule?

The documentation says:

These values are then used to attach a
numerical precedence value and
associativity direction to each
grammar rule. This is always
determined by looking at the
precedence of the right-most terminal
symbol.

What if the rule has no terminals?

UPDATE: The complete grammar:

Grammar

Rule 0     S' -> statement
Rule 1     statement -> course_data
Rule 2     or_phrase -> statement OR_CONJ statement
Rule 3     or_phrase -> course OR_CONJ COURSE_NUMBER
Rule 4     statement -> or_phrase
Rule 5     course_data -> course
Rule 6     course_data -> course course_list_tail
Rule 7     course_list_tail -> , COURSE_NUMBER
Rule 8     course_list_tail -> , COURSE_NUMBER course_list_tail
Rule 9     course -> DEPT_CODE COURSE_NUMBER

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

妞丶爷亲个 2024-09-10 08:30:32

您的基本问题是,您需要两个前瞻令牌才能完成您想要的操作 - 当到目前为止看到的输入是一个 course 而前瞻是一个 OR_CONJ 时,您不需要不知道是否将course减少为course_data或在不向前查看两个标记的情况下转移到OR_CONJ之后的标记。有多种方法可以处理这个问题

  • 使用 LR(2) 或 LR(k) 或 GLR 解析器生成器——任何方法都可以处理这个问题。

  • 使用词法分析器进行前瞻 - 基本上让词法分析器返回两个不同的 OR_CONJ 标记,具体取决于以下标记是否是 COURSE_NUMBER

  • 分解语法以消除冲突,这可能会导致语法解析的内容与您想要的略有不同(需要一些额外的解析后检查以拒绝一些无效的构造),并且通常会使语法变得更加困难理解。

请注意,您给出的语法对于三个或更多课程在单个语句关联中的连接方式也是不明确的。通过将语法重写为更清晰的左递归形式,可以轻松解决此问题:

Rule 1    statement -> course
Rule 2    statement -> statement OR_CONJ course
Rule 3    course -> DEPT_CODE course_list
Rule 4    course -> DEPT CODE course_list OR_CONJ COURSE_NUMBER
Rule 5    course_list -> COURSE_NUMBER
Rule 6    course_list -> course_list , COURSE_NUMBER

对于 LL 解析器生成器,也可以将其重写为右递归形式,但它仍然存在 2 令牌先行问题。重构它以消除这种情况的一种方法是将 COURSE_NUMBER 本身设置为有效的课程,并将其与之前的课程重新组合在一个post-pass(或者如果它是 statement 中的第一个 course,则给出错误)。那么规则 4 就变成了:

Rule 4    course -> COURSE_NUMBER

你们之间就没有冲突了。

Your basic problem is that you need two tokens of lookahead to do what you want -- when the input seen so far is a course and the lookahead is a OR_CONJ you don't know whether to reduce the course to a course_data or shift without looking ahead two tokens to the token after the OR_CONJ. There are a number of ways you can deal with this

  • use an LR(2) or LR(k) or GLR parser generator -- any can deal with this.

  • use a lexer hack to do the lookahead -- basically have the lexer return two different OR_CONJ tokens depending on whether the following token is a COURSE_NUMBER or not.

  • factor the grammar to get rid of the conflict, which may result in a grammar that parses something slightly different from what you want (need some extra post-parse checks to reject some invalid constructs) and will generally make the grammar much harder to understand.

Note that your grammar as given is also ambiguous related to which way three or more courses connected in a single statement associate. This is easily fixed by rewriting the grammar into a clearer left-recursive form:

Rule 1    statement -> course
Rule 2    statement -> statement OR_CONJ course
Rule 3    course -> DEPT_CODE course_list
Rule 4    course -> DEPT CODE course_list OR_CONJ COURSE_NUMBER
Rule 5    course_list -> COURSE_NUMBER
Rule 6    course_list -> course_list , COURSE_NUMBER

This could also be rewritten as right-recursive for an LL parser generator, but it still has the 2-token lookahead problem. One way of refactoring it to make that go away would be to make COURSE_NUMBER by itself a valid course and recombine it with the previous course in a post-pass (or give an error if its the first course in a statement). Then rule 4 becomes:

Rule 4    course -> COURSE_NUMBER

and you have no conflicts.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文