Bison style: is it bad to use my own stack? Are globals bad?

Posted 2025-01-06 05:38:59

My question is basically "what constitutes good style in YACC / Bison?" and, relatedly, whether or not I am letting Bison do the things it is good at.

For example, I find that my Bison program relies to a greater extent on globals than my original code did. Consider the following:

 prog : 
      vents{ /*handle semantics...*/ } 
      unity{ /*handle semantics...*/ } 
      defs;

If I want to pass information between the two curly brace-delimited blocks after "vents" and "unity", I think that using a global variable (technically, a variable with file-level scope and internal linkage) is the best I can do from an information-hiding standpoint. Any variable I declare within these blocks is local to its block (I think...), and the other designated places I can put C++ declarations result in file-level scope.

If I could inject a variable declaration into the "yyparse()" function, this would better suit my needs. Is there a hook for this sort of code, or some other way to inject such a variable? Or are globals just an accepted part of using Bison?

It has also occurred to me that maybe I am not supposed to even want to pass information between these sections in this way. But passing everything around using just $$, $1, $2, etc. seems difficult to me. Am I just not "getting it"?

I find one of my global variables to be particularly questionable even if I accept the rest of them. It is of type std::stack and relates to the input language's support for conditionals.

When I encounter a conditional ("if/else") in my compiler input, this results in the eventual emission of three assembly language labels, each composed of a text string followed by a number pulled from a sequence.

So, I am acquiring a sequence number when I first encounter the "if", pushing it onto a stack (since "if" structures can be nested), and then using it later (via "peeks" or "pops") to construct the necessary labels and jumps, e.g. after my condition, my "if" block, and my "else" block.
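The bookkeeping described above might look roughly like the following sketch. The class name `LabelStack` and its methods are hypothetical and are not taken from my actual code; the point is only that the stack could be wrapped in one small object rather than exposed as a bare global.

```cpp
#include <stack>
#include <string>

// Hypothetical helper: hands out sequence numbers for conditionals and
// remembers the ones that are still open, innermost on top.
class LabelStack {
public:
    // Called when an "if" is first seen: reserve a number and remember it.
    unsigned open() {
        open_.push(next_++);
        return open_.top();
    }

    // Label text for the innermost open conditional, e.g. "endif1".
    // Precondition: at least one conditional is open.
    std::string label(const std::string& prefix) const {
        return prefix + std::to_string(open_.top());
    }

    // Called once the whole if/else has been emitted.
    void close() { open_.pop(); }

private:
    unsigned next_ = 0;
    std::stack<unsigned> open_;
};
```

With two nested conditionals open, `label("endif")` refers to the inner one until `close()` is called, after which it refers to the outer one again.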

I tried to make this work using something like $-2 instead, but found that this identifier pertained not to the beginning of my conditional, but to the end of whatever block was just compiled. The system abstracted over by $ seems to pertain to the code as read from left to right, without any concept of how the structures within it are nested.

I don't expect you all to make this work for me... but was I at least on the right path with trying to use $$, $1, $-1, and so on? It's quite possible I just gave up too soon, and/or that I would benefit from taking a clean sheet approach, i.e. from throwing out my old ad hoc code altogether.

Is that the case? Or is my thrown-together approach with its std::stack and its globals OK?

自由范儿 2025-01-13 05:38:59

I don't see the difficulty in avoiding global variables; I barely use them, and then only to signal errors or similar things.

Think about the parser: what should it produce? An abstract syntax tree.

How is it made? It's an n-ary tree in which every node contains some information and its children, so there's no need for global variables.
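As a rough illustration of such a tree, here is a minimal sketch. The `Node` type and the `make` helper are hypothetical and much simpler than the AST classes in my language; they only show that each node owns its children and that a recursive visit needs no outside state.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical n-ary AST node: a label plus an owned list of children.
struct Node {
    std::string kind;
    std::vector<std::unique_ptr<Node>> children;

    explicit Node(std::string k) : kind(std::move(k)) {}

    // Append a child and return this node so calls can be chained.
    Node& add(std::unique_ptr<Node> child) {
        children.push_back(std::move(child));
        return *this;
    }

    // Recursive visit: render the subtree as an s-expression.
    std::string toString() const {
        if (children.empty()) return kind;
        std::string out = "(" + kind;
        for (const auto& c : children) out += " " + c->toString();
        return out + ")";
    }
};

// Convenience factory for leaves and inner nodes.
inline std::unique_ptr<Node> make(std::string kind) {
    return std::make_unique<Node>(std::move(kind));
}
```

A "+" node with the children "1" and "x" renders as `(+ 1 x)`, and all the information needed to do so lives in the nodes themselves.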

I'll give you a peek of a language I'm writing just to give you the idea:

bexp:
  bexp T_PLUS bexp { $$ = new ASTBExp($2,$1,$3); }
  | bexp T_MINUS bexp { $$ = new ASTBExp($2,$1,$3); }
  | bexp T_TIMES bexp { $$ = new ASTBExp($2,$1,$3); }
  | bexp T_DIV bexp { $$ = new ASTBExp($2,$1,$3); }

uexp:
  raw_value { $$ = $1; }
  | UOP_NOT uexp { $$ = new ASTUExp($1,$2); }
  | T_LPAREN bexp T_RPAREN { $$ = $2; }
  | var_ref { $$ = new ASTVarRef((ASTIdentifier*)$1); }
  | call { $$ = $1; }

As you can see, every node parsed is instantiated with the children from the grammar, which are also semantically children in the abstract syntax tree, and is returned in $$.

The root element is something like

start: root { Compiler::instance()->setAST((ASTRoot*)$1); }
;

root:
  function_list { $$ = new ASTRoot($1); }
;

in which I just get the whole tree and pass it to an instance of my Compiler class.

Now if you look at the function that calls yyparse()

bool parseSource()
{
  //yydebug = 1;
  // Redirect stdin to the source file; bail out if it cannot be opened.
  if (freopen(fileName, "r", stdin) == NULL)
    return false;
  yyparse();

  return !failed;
}

I just open a file and call the parsing routine. This function is called by the Compiler class here:

  bool compile()
  {
    if (!parseSource())
      return false;

    if (!populateFunctionsTable())
      return false;

    ast->recursivePrint(0);
    Utils::switchStdout(binaryFile);
    ast->generateASM();
    Utils::revertStdout();

    assemble();

    return true;
  }

As you can see here, the parsing routine is called; it creates the whole tree and then sets it inside the Compiler class. A recursive visit of the tree (function generateASM) does the dirty work.

I hope this clarifies a little how you should use your parser; let me know if you need any further info. You don't need to do all the work in the parser. Just do the parsing there; everything else can be solved with some recursive calls over your abstract syntax tree.

Another practical example is the if/else statement you are talking about. In the grammar it is defined as

if_stat:
  KW_IF T_LPAREN exp T_RPAREN block %prec LOWER_THAN_ELSE { $$ = new ASTIfStat($3, $5); }
  | KW_IF T_LPAREN exp T_RPAREN block KW_ELSE block { $$ = new ASTIfStat($3, $5, $7); }
;

A special node is created to manage the if/else construct, which then works simply by having this generateASM function:

 void generateASM()
  {
    if (m_fbody == NULL)
    {
      // Reserve the label before recursing: a nested "if" inside m_tbody
      // advances labelCounter as well.
      u32 c = labelCounter++;

      m_condition->generateASM();
      printf("NOT\n");
      printf("JUMPC iflabel%u\n", c);
      m_tbody->generateASM();
      printf("iflabel%u:\n", c);
    }
    else
    {
      u32 c = labelCounter++;
      u32 d = labelCounter++;

      m_condition->generateASM();
      printf("JUMPC iflabel%u\n", c);   // condition true: jump to the "then" body
      m_fbody->generateASM();           // fall through into the "else" body
      printf("JUMP iflabel%u\n", d);
      printf("iflabel%u:\n", c);
      m_tbody->generateASM();
      printf("iflabel%u:\n", d);
    }
  }
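To show why nesting needs no explicit stack here, the following is a self-contained sketch of the same idea. The `Emitter`, `Stmt`, `Raw` and `If` types and the `PUSH`/`CALL` mnemonics are hypothetical stand-ins, not my real classes or instruction set; only `NOT`, `JUMPC` and the `iflabel` naming come from the function above. Each node reserves its label number in a local variable before recursing, so the C++ call stack plays the role of the hand-written std::stack.

```cpp
#include <memory>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical generator state: the output stream and the label counter
// travel together instead of living in globals.
struct Emitter {
    std::ostream& out;
    unsigned nextLabel = 0;
};

struct Stmt {
    virtual ~Stmt() = default;
    virtual void generate(Emitter& e) const = 0;
};

// Leaf statement that emits one fixed (made-up) instruction.
struct Raw : Stmt {
    std::string text;
    explicit Raw(std::string t) : text(std::move(t)) {}
    void generate(Emitter& e) const override { e.out << text << "\n"; }
};

// "if" without "else".
struct If : Stmt {
    std::unique_ptr<Stmt> cond, body;
    If(std::unique_ptr<Stmt> c, std::unique_ptr<Stmt> b)
        : cond(std::move(c)), body(std::move(b)) {}
    void generate(Emitter& e) const override {
        const unsigned end = e.nextLabel++;  // reserve before recursing
        cond->generate(e);
        e.out << "NOT\n" << "JUMPC iflabel" << end << "\n";
        body->generate(e);                   // may reserve further labels
        e.out << "iflabel" << end << ":\n";
    }
};

// Builds `if (a) { if (b) { c } }` and returns the generated text.
inline std::string demoNestedIf() {
    auto inner = std::make_unique<If>(std::make_unique<Raw>("PUSH b"),
                                      std::make_unique<Raw>("CALL c"));
    If outer(std::make_unique<Raw>("PUSH a"), std::move(inner));
    std::ostringstream os;
    Emitter e{os};
    outer.generate(e);
    return os.str();
}
```

In the output of `demoNestedIf()`, the inner conditional closes with `iflabel1:` before the outer one closes with `iflabel0:`, which is the nesting order the asker's stack was maintaining by hand.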


爱你是孤单的心事 2025-01-13 05:38:59

Mid-rule actions can push a value onto the stack.

If you have

rule
    : A B { ... } C

Bison automatically converts this into

some_identifier
    : /* empty */ { ... }

rule
    : A B some_identifier C

and its value can be accessed exactly as such. In this case, the mid-rule semantic action has its value stored on the Bison stack, where it can be accessed again later in the same rule.

Usually, these constructs are recursive. Consider the following simple snippet:

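// C++
class Statement { public: virtual ~Statement() {} };
class Expression : public Statement {};
// Members are public so that the grammar actions below can assign them.
class IfStatement : public Statement { public: Statement* if_true; Expression* condition; };

// Bison
%type <if_stmt> if_statement
%type <stmt> statement
%type <expr> expression
%union {
    IfStatement* if_stmt;
    Statement* stmt;
    Expression* expr;  // assumed member, needed so that $4 below has a type
}

// A mid-rule action has no declared type, so its value is written and read
// with an explicit tag: $<if_stmt>$ and $<if_stmt>2.
if_statement
    : if { $<if_stmt>$ = new IfStatement(); } 
      '(' expression { $<if_stmt>2->condition = $4; } 
      ')' statement { $<if_stmt>2->if_true = $7; $$ = $<if_stmt>2; }

statement
    : if_statement { $$ = $1; }
    | ...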

There's no need for an external stack to do recursive functionality like this.
