Happy/YACC reduces when it should shift

Posted on 2024-10-18 10:40:09

I'm working on a parser and I'm really frustrated. In the language, we can have an expression like:

new int[3][][]

or

new int[3]

Most of it parses correctly, except for the empty arrays at the end.
In my parser I have:

Expression : int
           | char
           | null
           | (...many others...)
           | new NewExpression

and then a NewExpression is:

NewExpression : NonArrayType '[' Expression ']' EmptyArrays
              | NonArrayType '[' Expression ']' 

and then EmptyArrays is one or more empty bracket pairs. (If I instead let EmptyArrays derive the empty string, it adds 20 shift/reduce conflicts.)

EmptyArrays : EmptyArrays EmptyArray
            | EmptyArray
EmptyArray  : '[' ']'

However, when I look in the .info file for the parser, I get this:

State 214

    NewExpression -> NonArrayType lbrace Expression rbrace . EmptyArrays    (rule 80)
    NewExpression -> NonArrayType lbrace Expression rbrace .    (rule 81)

    dot            reduce using rule 81
    ';'            reduce using rule 81
    ','            reduce using rule 81
    '+'            reduce using rule 81
    '-'            reduce using rule 81
    '*'            reduce using rule 81
    '/'            reduce using rule 81
    '<'            reduce using rule 81
    '>'            reduce using rule 81
    '<='           reduce using rule 81
    '>='           reduce using rule 81
    '=='           reduce using rule 81
    '!='           reduce using rule 81
    ')'            reduce using rule 81
    '['            reduce using rule 81    --I expect this should shift
    ']'            reduce using rule 81
    '?'            reduce using rule 81
    ':'            reduce using rule 81
    '&&'           reduce using rule 81
    '||'           reduce using rule 81

I'd expect, though, that if we're in state 214 and we see a left bracket (the lbrace token), we should shift it onto the stack and continue to parse EmptyArrays.

I'm not exactly sure what is going on, because when I strip out all of the excess baggage (e.g. by starting the parse with NewExpression), the additional brackets parse correctly. It's not possible for an Expression, a Statement, or any other non-terminal in the grammar to start with a left bracket. It's especially puzzling because I have a similar rule for if/else statements, which generates a shift/reduce conflict but chooses to shift if the next token is an else (this problem is well documented).

Can you help me figure out what is going wrong? I really appreciate your help; I am really tilting at windmills trying to figure out the problem.

Comments (1)

单挑你×的.吻 2024-10-25 10:40:09

You probably have a precedence set for '[' and/or ']', with something like %left '[', which causes this behavior. Remove that precedence declaration and the shift/reduce conflict you have here will be reported. As for why it's a shift/reduce conflict, you probably also have a rule:

Expression: Expression '[' Expression ']'

for an array access. The problem is that, since a NewExpression is an Expression, it may be followed by an index like this, and when the lookahead is '[', the parser can't tell whether that's the beginning of an index expression or the beginning of an EmptyArray; deciding would require 2-token lookahead.
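
To make the lookahead problem concrete, here is a small Haskell sketch that compares two hand-written token streams, one for new int[3][] and one for new int[3][0]. The Token type and its constructor names are hypothetical and stand in for whatever the real lexer produces; the sketch only shows where the two inputs first differ.

```haskell
-- Hypothetical token type, for illustration only.
data Token = New | Ident String | IntLit Int | LBracket | RBracket
  deriving (Eq, Show)

-- new int[3][]   (an empty dimension follows the sized one)
emptyDim :: [Token]
emptyDim = [New, Ident "int", LBracket, IntLit 3, RBracket, LBracket, RBracket]

-- new int[3][0]  (an array access applied to the new expression)
indexed :: [Token]
indexed = [New, Ident "int", LBracket, IntLit 3, RBracket, LBracket, IntLit 0, RBracket]

-- Longest prefix on which the two lists agree.
commonPrefix :: Eq a => [a] -> [a] -> [a]
commonPrefix (x:xs) (y:ys) | x == y = x : commonPrefix xs ys
commonPrefix _ _ = []

main :: IO ()
main = do
  let n = length (commonPrefix emptyDim indexed)
  print n                         -- number of tokens the inputs share
  print (take 1 (drop n emptyDim)) -- first differing token, empty-dimension case
  print (take 1 (drop n indexed))  -- first differing token, index case
```

The shared prefix is six tokens long: the five tokens already consumed when the parser sits in state 214, plus the '[' that is the current lookahead. The inputs only diverge on the token after that, which a one-token-lookahead parser cannot see when it has to choose between shifting and reducing.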

One thing you could try for this specific case would be to have your lexer do the extra lookahead needed here and recognize [] as a single token.
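
A minimal sketch of that lexer idea in Haskell follows, assuming a hand-written lexer over a String. The Token type, the constructor names and the lexer function are all hypothetical, since the question does not show the real lexer. When the lexer sees '[', it skips any whitespace and checks whether ']' comes next; if so it emits one fused token instead of two.

```haskell
import Data.Char (isAlpha, isAlphaNum, isDigit, isSpace)

-- Hypothetical token type; the real parser's tokens will differ.
data Token
  = TNew
  | TIdent String
  | TInt Int
  | TLBracket       -- '['
  | TRBracket       -- ']'
  | TEmptyBrackets  -- '[' immediately followed by ']' (whitespace allowed between)
  deriving (Eq, Show)

lexer :: String -> [Token]
lexer [] = []
lexer (c:cs)
  | isSpace c = lexer cs
  | isDigit c =
      let (ds, rest) = span isDigit (c:cs)
      in TInt (read ds) : lexer rest
  | isAlpha c =
      let (w, rest) = span isAlphaNum (c:cs)
      in (if w == "new" then TNew else TIdent w) : lexer rest
lexer ('[':cs) =
  -- Extra lookahead done in the lexer: is the next non-space character ']'?
  case dropWhile isSpace cs of
    (']':rest) -> TEmptyBrackets : lexer rest
    _          -> TLBracket : lexer cs
lexer (']':cs) = TRBracket : lexer cs
lexer (c:_) = error ("unexpected character: " ++ show c)

main :: IO ()
main = mapM_ (print . lexer) ["new int[3][][]", "new int[3][0]"]
```

With such a token (written '[]' below, again a hypothetical name), the EmptyArrays rules from the question could become:

EmptyArrays : EmptyArrays '[]'
            | '[]'

After NonArrayType '[' Expression ']', a lookahead of '[]' would then shift into EmptyArrays, while a lookahead of '[' would reduce and go on to an array access, so one token of lookahead should be enough. Any other rule that currently matches '[' ']' as two separate tokens would need to use the fused token as well.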
