Happy/YACC reduces when it should shift

Posted 2024-10-18 10:40:09

I'm working on a parser and I'm really frustrated. In the language, we can have an expression like:

new int[3][][]

or

new int[3]

Most of it parses correctly, except for the empty arrays at the end.
In my parser I have:

Expression : int
           | char
           | null
           | (...many others...)
           | new NewExpression

and then a NewExpression is:

NewExpression : NonArrayType '[' Expression ']' EmptyArrays
              | NonArrayType '[' Expression ']'

and then EmptyArrays is one or more empty bracket pairs. (If I instead let EmptyArrays derive the empty string, it adds 20 shift/reduce conflicts.)

EmptyArrays : EmptyArrays EmptyArray
            | EmptyArray
EmptyArray  : '[' ']'

However, when I look in the .info file for the parser, I get this:

State 214

    NewExpression -> NonArrayType lbrace Expression rbrace . EmptyArrays    (rule 80)
    NewExpression -> NonArrayType lbrace Expression rbrace .    (rule 81)

    dot            reduce using rule 81
    ';'            reduce using rule 81
    ','            reduce using rule 81
    '+'            reduce using rule 81
    '-'            reduce using rule 81
    '*'            reduce using rule 81
    '/'            reduce using rule 81
    '<'            reduce using rule 81
    '>'            reduce using rule 81
    '<='           reduce using rule 81
    '>='           reduce using rule 81
    '=='           reduce using rule 81
    '!='           reduce using rule 81
    ')'            reduce using rule 81
    '['            reduce using rule 81    --I expect this should shift
    ']'            reduce using rule 81
    '?'            reduce using rule 81
    ':'            reduce using rule 81
    '&&'           reduce using rule 81
    '||'           reduce using rule 81

I would expect that if we're in state 214 and we see a left brace ('['), we shift it onto the stack and continue to parse EmptyArrays.

I'm not exactly sure what is going on, because when I strip out all the excess baggage (e.g. by starting the parse at NewExpression), the additional brackets parse correctly. It's not possible for an Expression, a Statement, or any other non-terminal in the grammar to start with a left brace. It's especially confusing because I have a similar rule for if/else statements, which generates a shift/reduce conflict but chooses to shift if the next token is an else (this problem is well documented).

Can you help me figure out what is going wrong? I'd really appreciate your help; I am really tilting at windmills trying to figure out the problem.

单挑你×的.吻 2024-10-25 10:40:09

You probably have a precedence set for '[' and/or ']', with something like %left '[', which causes this behavior. Precedence declarations resolve shift/reduce conflicts silently, which is why none is reported for this state. Remove that precedence declaration, and this will reveal the shift/reduce conflict you have here. As for why it's a shift/reduce conflict, you probably also have a rule:

Expression: Expression '[' Expression ']'

for an array access. The problem is that since a NewExpression is an Expression, it may be followed by an index like this, and with a lookahead of '[' the parser can't tell whether that's the beginning of an index expression or the beginning of an EmptyArray; deciding would require 2-token lookahead.
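
To make the lookahead problem concrete, here is a small Haskell sketch. The token type and its constructor names are made up for illustration and are not taken from the asker's grammar. It assumes the indexing rule above exists, so that `new int[3][0]` is an index applied to a new array. Both inputs are identical through the `[` that follows the first `]`, so one token of lookahead cannot separate them.

```haskell
-- Hypothetical token type for illustration; the names are assumptions.
data Token = TokNew | TokInt | TokNum Int | TokLBracket | TokRBracket
  deriving (Eq, Show)

-- new int[3][]   : the second '[' opens an EmptyArray
emptyDimension :: [Token]
emptyDimension =
  [TokNew, TokInt, TokLBracket, TokNum 3, TokRBracket, TokLBracket, TokRBracket]

-- new int[3][0]  : the second '[' opens an index expression
indexAccess :: [Token]
indexAccess =
  [TokNew, TokInt, TokLBracket, TokNum 3, TokRBracket, TokLBracket, TokNum 0, TokRBracket]

-- Number of tokens already consumed when the parser sits in the state
-- just after the first ']' (state 214 in the question).
consumed :: Int
consumed = 5

-- The next k tokens the parser would be allowed to inspect at that point.
lookahead :: Int -> [Token] -> [Token]
lookahead k = take k . drop consumed
```

Here `lookahead 1` gives the same result for both inputs, while `lookahead 2` gives different results. An LALR(1) parser only gets the first view.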

One thing you could try for this specific case would be to have your lexer do the extra lookahead needed here and recognize [] as a single token.
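
A minimal sketch of that lexer idea follows, written as a hand-rolled Haskell lexer. Everything in it is an assumption for illustration: the token constructors, the `lexer` and `keyword` names, and the choice to allow whitespace between the brackets. A real project might generate its lexer with Alex instead.

```haskell
import Data.Char (isAlpha, isAlphaNum, isDigit, isSpace)

-- Hypothetical token type; constructor names are assumptions,
-- not taken from the original grammar.
data Token
  = TokNew
  | TokIdent String      -- identifiers and type names such as "int"
  | TokNum Int
  | TokLBracket          -- '['
  | TokRBracket          -- ']'
  | TokEmptyBrackets     -- '[' immediately followed by ']', fused into one token
  deriving (Eq, Show)

-- Map reserved words to their tokens; everything else is an identifier.
keyword :: String -> Token
keyword "new" = TokNew
keyword name  = TokIdent name

lexer :: String -> [Token]
lexer [] = []
lexer (c:cs)
  | isSpace c = lexer cs
  | isDigit c =
      let (ds, rest) = span isDigit (c:cs)
      in TokNum (read ds) : lexer rest
  | isAlpha c =
      let (name, rest) = span isAlphaNum (c:cs)
      in keyword name : lexer rest
-- When no guard above matches, evaluation falls through to the clauses below.
lexer ('[':cs) =
  -- Peek past whitespace: if the next significant character is ']',
  -- emit the fused token; otherwise emit an ordinary '['.
  case dropWhile isSpace cs of
    (']':rest) -> TokEmptyBrackets : lexer rest
    _          -> TokLBracket : lexer cs
lexer (']':cs) = TokRBracket : lexer cs
lexer (c:_) = error ("unexpected character: " ++ [c])
```

With a lexer along these lines, EmptyArray could be written against the fused token (for example `EmptyArray : '[]'`, with `'[]'` mapped to the hypothetical `TokEmptyBrackets` in the `%token` section). After the first `]`, the parser would then see either `'[]'` or `'['` as its single lookahead token. Any other rule in the grammar that currently matches `'[' ']'` would need to use the fused token as well.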
