如何构造 yacc 代码以支持可选的非终结符

发布于 2024-10-15 20:08:23 字数 1353 浏览 3 评论 0原文

在 yacc 中对可选数据建模的最佳方法是什么?我有以下声明:

StmtBlock           :    '{' VariableDeclList StmtList '}' {  $$ = new StmtBlock($2, $3); }
                    ;

VariableDeclList 和 StmtList 都是可选的(epsilon),因此我将它们建模如下:

VariableDeclList    :    VariableDeclList VariableDecl  { ($$=$1)->Append($2); }
                    |    { $$ = new List<VarDecl*>; }

StmtList            :    StmtList Stmt { ($$=$1)->Append($2); }
                    |    { $$ = new List<Stmt*>; }
                    ;

唯一的问题是当我认为这会导致移位/归约冲突时 当我尝试编译代码时,我的 y.ouput 文件具有以下内容:

State 74 conflicts: 1 shift/reduce
...
state 74

   38 StmtBlock: '{' VariableDeclList . StmtList '}'
   39 VariableDeclList: VariableDeclList . VariableDecl

    T_Bool        shift, and go to state 2
    T_Int         shift, and go to state 3
    T_Double      shift, and go to state 4
    T_String      shift, and go to state 5
    T_Identifier  shift, and go to state 8

    T_Identifier  [reduce using rule 18 (Epsilon)]
    $default      reduce using rule 18 (Epsilon)

    VariableDecl  go to state 80
    Variable      go to state 13
    Type          go to state 34
    Epsilon       go to state 81
    StmtList      go to state 82
...

是否有更合适的方法来对此进行建模?

What is the best way to model optional data in yacc? I have the following statement:

StmtBlock           :    '{' VariableDeclList StmtList '}' {  $ = new StmtBlock($2, $3); }
                    ;

Both, VariableDeclList and StmtList are optional (epsilon) so I modeled them as follows:

VariableDeclList    :    VariableDeclList VariableDecl  { ($=$1)->Append($2); }
                    |    { $ = new List<VarDecl*>; }

and

StmtList            :    StmtList Stmt { ($=$1)->Append($2); }
                    |    { $ = new List<Stmt*>; }
                    ;

The only problem is when I think this causes shift/reduce conflicts. When I try to compile my code my y.ouput file has the following:

State 74 conflicts: 1 shift/reduce
...
state 74

   38 StmtBlock: '{' VariableDeclList . StmtList '}'
   39 VariableDeclList: VariableDeclList . VariableDecl

    T_Bool        shift, and go to state 2
    T_Int         shift, and go to state 3
    T_Double      shift, and go to state 4
    T_String      shift, and go to state 5
    T_Identifier  shift, and go to state 8

    T_Identifier  [reduce using rule 18 (Epsilon)]
    $default      reduce using rule 18 (Epsilon)

    VariableDecl  go to state 80
    Variable      go to state 13
    Type          go to state 34
    Epsilon       go to state 81
    StmtList      go to state 82
...

Is there a more proper way to model this?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

滴情不沾 2024-10-22 20:08:23

正如您所注意到的,如果 StmtVariableDecl 无法通过它们的第一个标记来区分,这将导致移位/归约冲突 - 在您的示例中是 T_Identifier 可能会开始。您可以尝试多种方法:

  • 将两个列表组合成一个 VariableDeclOrStmtList,其中可以包含混合的 StmtVariableDecl反正。 之后不存在 VariableDecls

  • 使用更多前瞻来确定差异——要么使用 bison 的 GLR 模式,要么使用 btyacc 之类的东西,可以推迟确定某物是否是StmtVariableDecl 直到它看到足够的信息来决定

  • 在这些结构的开头使用不同的标记来明确区分它们。这些可以是您语言中的真实标记,也可以是词法分析器根据某些附加上下文插入的合成标记(这意味着基本上在词法分析器中进行更多的前瞻操作)。

编辑

对于第一个建议,您将拥有如下规则:

VarDeclOrStmtList : VarDeclOrStmtList VariableDecl { ($=$1)->Append($2); }
                  | VarDeclOrStmtList Stmt         { ($=$1)->Append($2); }
                  | { $ = new List<Node *>; }
                  ;

其中 NodeVarDeclStmt< 的公共基类/代码>。然后,在 StmtBlock 操作中,调用一个遍历 List 的函数,将其拆分为 ListList< ;Stmt *> 并在格式不正确时给出错误消息。

As you note, this will cause shift/reduce conflicts if a Stmt and a VariableDecl can't be distinguished by their first token -- in your example a T_Identifier might begin either. There are a number of things you can try:

  • combine the two lists into a VariableDeclOrStmtList which can contain either Stmt or VariableDecl intermixed in any way. This is more general than what you have, so you'll need a post-check to ensure that there are no VariableDecls after the first Stmt

  • Use more lookahead to determine the difference -- either use bison's GLR mode, or something like btyacc that can defer deciding whether something is a Stmt or VariableDecl until it has seen enough to decide

  • Use different tokens at the beginning of these constructs to unambiguously differentiate them. These can be either real tokens in your language, or synthetic tokens that the lexer inserts based on some additional context (which means basically doing more lookahead in the lexer).

edit

For the first suggestion, you'll have a rule like:

VarDeclOrStmtList : VarDeclOrStmtList VariableDecl { ($=$1)->Append($2); }
                  | VarDeclOrStmtList Stmt         { ($=$1)->Append($2); }
                  | { $ = new List<Node *>; }
                  ;

where Node is a common base class of VarDecl and Stmt. Then, in your StmtBlock action, call a function that goes through the List<Node *>, splitting it into a List<VarDecl *> and List<Stmt *> and giving an error message if it is not well-formed.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文