如何构造 yacc 代码以支持可选的非终结符
在 yacc 中对可选数据建模的最佳方法是什么?我有以下声明:
StmtBlock : '{' VariableDeclList StmtList '}' { $$ = new StmtBlock($2, $3); }
;
VariableDeclList 和 StmtList 都是可选的(epsilon),因此我将它们建模如下:
VariableDeclList : VariableDeclList VariableDecl { ($$=$1)->Append($2); }
| { $$ = new List<VarDecl*>; }
。
StmtList : StmtList Stmt { ($$=$1)->Append($2); }
| { $$ = new List<Stmt*>; }
;
唯一的问题是当我认为这会导致移位/归约冲突时 当我尝试编译代码时,我的 y.ouput 文件具有以下内容:
State 74 conflicts: 1 shift/reduce
...
state 74
38 StmtBlock: '{' VariableDeclList . StmtList '}'
39 VariableDeclList: VariableDeclList . VariableDecl
T_Bool shift, and go to state 2
T_Int shift, and go to state 3
T_Double shift, and go to state 4
T_String shift, and go to state 5
T_Identifier shift, and go to state 8
T_Identifier [reduce using rule 18 (Epsilon)]
$default reduce using rule 18 (Epsilon)
VariableDecl go to state 80
Variable go to state 13
Type go to state 34
Epsilon go to state 81
StmtList go to state 82
...
是否有更合适的方法来对此进行建模?
What is the best way to model optional data in yacc? I have the following statement:
StmtBlock : '{' VariableDeclList StmtList '}' { $ = new StmtBlock($2, $3); }
;
Both, VariableDeclList and StmtList are optional (epsilon) so I modeled them as follows:
VariableDeclList : VariableDeclList VariableDecl { ($=$1)->Append($2); }
| { $ = new List<VarDecl*>; }
and
StmtList : StmtList Stmt { ($=$1)->Append($2); }
| { $ = new List<Stmt*>; }
;
The only problem is when I think this causes shift/reduce conflicts. When I try to compile my code my y.ouput file has the following:
State 74 conflicts: 1 shift/reduce
...
state 74
38 StmtBlock: '{' VariableDeclList . StmtList '}'
39 VariableDeclList: VariableDeclList . VariableDecl
T_Bool shift, and go to state 2
T_Int shift, and go to state 3
T_Double shift, and go to state 4
T_String shift, and go to state 5
T_Identifier shift, and go to state 8
T_Identifier [reduce using rule 18 (Epsilon)]
$default reduce using rule 18 (Epsilon)
VariableDecl go to state 80
Variable go to state 13
Type go to state 34
Epsilon go to state 81
StmtList go to state 82
...
Is there a more proper way to model this?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
正如您所注意到的,如果 Stmt 和 VariableDecl 无法通过它们的第一个标记来区分,这将导致移位/归约冲突 - 在您的示例中是
T_Identifier
可能会开始。您可以尝试多种方法:将两个列表组合成一个 VariableDeclOrStmtList,其中可以包含混合的 Stmt 或 VariableDecl反正。 之后不存在 VariableDecls
使用更多前瞻来确定差异——要么使用 bison 的 GLR 模式,要么使用 btyacc 之类的东西,可以推迟确定某物是否是Stmt 或 VariableDecl 直到它看到足够的信息来决定
在这些结构的开头使用不同的标记来明确区分它们。这些可以是您语言中的真实标记,也可以是词法分析器根据某些附加上下文插入的合成标记(这意味着基本上在词法分析器中进行更多的前瞻操作)。
编辑
对于第一个建议,您将拥有如下规则:
其中
Node
是VarDecl
和Stmt< 的公共基类/代码>。然后,在 StmtBlock 操作中,调用一个遍历
List
的函数,将其拆分为List
和List< ;Stmt *>
并在格式不正确时给出错误消息。As you note, this will cause shift/reduce conflicts if a Stmt and a VariableDecl can't be distinguished by their first token -- in your example a
T_Identifier
might begin either. There are a number of things you can try:combine the two lists into a VariableDeclOrStmtList which can contain either Stmt or VariableDecl intermixed in any way. This is more general than what you have, so you'll need a post-check to ensure that there are no VariableDecls after the first Stmt
Use more lookahead to determine the difference -- either use bison's GLR mode, or something like btyacc that can defer deciding whether something is a Stmt or VariableDecl until it has seen enough to decide
Use different tokens at the beginning of these constructs to unambiguously differentiate them. These can be either real tokens in your language, or synthetic tokens that the lexer inserts based on some additional context (which means basically doing more lookahead in the lexer).
edit
For the first suggestion, you'll have a rule like:
where
Node
is a common base class ofVarDecl
andStmt
. Then, in your StmtBlock action, call a function that goes through theList<Node *>
, splitting it into aList<VarDecl *>
andList<Stmt *>
and giving an error message if it is not well-formed.