了解潜在的冲突
我有一个由 Menhir 构建的小型表达式解析器。我试图通过在 parser.mly 中编写恢复语法来在解析期间恢复括号不完整表达式:
%{
open AST
%}
%token<int> LINT
%token<string> ID
%token LPAREN RPAREN COMMA
%token EOF PLUS STAR EQ
%start<AST.expression> expressionEOF
%right LPAREN RPAREN
%nonassoc EQ
%left PLUS
%left STAR
%%
expressionEOF: e=expression EOF
{
e
}
expression:
| x=LINT
{
Int x
}
| x=identifier
{
Read x
}
| e1=expression b=binop e2=expression
{
Binop (b, e1, e2)
}
| e1=expression b=binop
(* for "2+", "2*3+" *)
{
Binop (b, e1, FakeExpression)
}
| LPAREN e=expression RPAREN
{
Paren e
}
| LPAREN RPAREN
(* for "()" *)
{
Paren FakeExpression
}
| LPAREN
(* for "(" *)
{
ParenMissingRparen FakeExpression
}
| LPAREN e=expression
(* for "(1", "(1+2", "(1+2*3", "((1+2)" *)
{
ParenMissingRparen e
}
| RPAREN
(* for ")" *)
{
ExtraRparen FakeExpression
}
| e=expression RPAREN
(* for "3)", "4))", "2+3)" *)
{
ExtraRparen e
}
%inline binop:
PLUS { Add }
| STAR { Mul }
| EQ { Equal }
identifier: x=ID
{
Id x
}
它在一组不完整表达式上运行良好。但是,menhir --explain parser.mly 返回以下parser.conflict:
** Conflict (reduce/reduce) in state 10.
** Tokens involved: STAR RPAREN PLUS EQ EOF
** The following explanations concentrate on token STAR.
** This state is reached from expressionEOF after reading:
LPAREN expression RPAREN
** The derivations that appear below have the following common factor:
** (The question mark symbol (?) represents the spot where the derivations begin to differ.)
expressionEOF
expression EOF
expression STAR expression // lookahead token appears
(?)
** In state 10, looking ahead at STAR, reducing production
** expression -> LPAREN expression RPAREN
** is permitted because of the following sub-derivation:
LPAREN expression RPAREN .
** In state 10, looking ahead at STAR, reducing production
** expression -> expression RPAREN
** is permitted because of the following sub-derivation:
LPAREN expression // lookahead token is inherited
expression RPAREN .
** Conflict (reduce/reduce) in state 3.
** Tokens involved: STAR RPAREN PLUS EQ EOF
** The following explanations concentrate on token STAR.
** This state is reached from expressionEOF after reading:
LPAREN RPAREN
** The derivations that appear below have the following common factor:
** (The question mark symbol (?) represents the spot where the derivations begin to differ.)
expressionEOF
expression EOF
expression STAR expression // lookahead token appears
(?)
** In state 3, looking ahead at STAR, reducing production
** expression -> LPAREN RPAREN
** is permitted because of the following sub-derivation:
LPAREN RPAREN .
** In state 3, looking ahead at STAR, reducing production
** expression -> RPAREN
** is permitted because of the following sub-derivation:
LPAREN expression // lookahead token is inherited
RPAREN .
我不明白它试图解释什么。谁能告诉我可能存在哪些潜在冲突(优先举例)以及解决方案是什么?
I have a small parser of expression built by Menhir. I'm trying to recover parenthesis-incomplete expressions during parsing by writing recovery grammars in parser.mly
:
%{
open AST
%}
%token<int> LINT
%token<string> ID
%token LPAREN RPAREN COMMA
%token EOF PLUS STAR EQ
%start<AST.expression> expressionEOF
%right LPAREN RPAREN
%nonassoc EQ
%left PLUS
%left STAR
%%
expressionEOF: e=expression EOF
{
e
}
expression:
| x=LINT
{
Int x
}
| x=identifier
{
Read x
}
| e1=expression b=binop e2=expression
{
Binop (b, e1, e2)
}
| e1=expression b=binop
(* for "2+", "2*3+" *)
{
Binop (b, e1, FakeExpression)
}
| LPAREN e=expression RPAREN
{
Paren e
}
| LPAREN RPAREN
(* for "()" *)
{
Paren FakeExpression
}
| LPAREN
(* for "(" *)
{
ParenMissingRparen FakeExpression
}
| LPAREN e=expression
(* for "(1", "(1+2", "(1+2*3", "((1+2)" *)
{
ParenMissingRparen e
}
| RPAREN
(* for ")" *)
{
ExtraRparen FakeExpression
}
| e=expression RPAREN
(* for "3)", "4))", "2+3)" *)
{
ExtraRparen e
}
%inline binop:
PLUS { Add }
| STAR { Mul }
| EQ { Equal }
identifier: x=ID
{
Id x
}
It works fine on a set of incomplete expressions. However, menhir --explain parser.mly
returns the following parser.conflict
:
** Conflict (reduce/reduce) in state 10.
** Tokens involved: STAR RPAREN PLUS EQ EOF
** The following explanations concentrate on token STAR.
** This state is reached from expressionEOF after reading:
LPAREN expression RPAREN
** The derivations that appear below have the following common factor:
** (The question mark symbol (?) represents the spot where the derivations begin to differ.)
expressionEOF
expression EOF
expression STAR expression // lookahead token appears
(?)
** In state 10, looking ahead at STAR, reducing production
** expression -> LPAREN expression RPAREN
** is permitted because of the following sub-derivation:
LPAREN expression RPAREN .
** In state 10, looking ahead at STAR, reducing production
** expression -> expression RPAREN
** is permitted because of the following sub-derivation:
LPAREN expression // lookahead token is inherited
expression RPAREN .
** Conflict (reduce/reduce) in state 3.
** Tokens involved: STAR RPAREN PLUS EQ EOF
** The following explanations concentrate on token STAR.
** This state is reached from expressionEOF after reading:
LPAREN RPAREN
** The derivations that appear below have the following common factor:
** (The question mark symbol (?) represents the spot where the derivations begin to differ.)
expressionEOF
expression EOF
expression STAR expression // lookahead token appears
(?)
** In state 3, looking ahead at STAR, reducing production
** expression -> LPAREN RPAREN
** is permitted because of the following sub-derivation:
LPAREN RPAREN .
** In state 3, looking ahead at STAR, reducing production
** expression -> RPAREN
** is permitted because of the following sub-derivation:
LPAREN expression // lookahead token is inherited
RPAREN .
I don't understand what it tries to explain. Could anyone tell me what may be potential conflicts (with example by preference) and what would be solutions?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
您有:
因此,您希望
( x )
匹配第一条规则:它确实如此。但它也以另一种方式匹配:
它也像这样匹配:
因为你还让
expr
匹配(
和)
,( )
还可以通过多种方式进行匹配,包括作为expr ')'
(使用expr -> '('
) 或'(' expr
(带有expr -> ')'
)。“解决方案”是放弃尝试添加无效句子的识别,一旦语法错误,您可以尝试使用 Menhir 的错误恢复机制来产生错误。消息并继续解析。
You have:
So, you want
( x )
to match the first rule:Which it does. But it also matches another way:
And it also matches like this:
Since you also let
expr
match(
and)
,( )
can also be matched several ways, including asexpr ')'
(withexpr -> '('
), or'(' expr
(withexpr -> ')'
).The "solution" is to give up trying to add recognition of invalid sentences. The parse should fail on a syntax error; once it fails you can try to use Menhir's error recovery mechanism to produce an error message and continue the parse. See section 11 of the manual.