在编译器中生成中间代码。处理条件时是否总是需要 AST 或解析树?

发布于 2024-10-24 05:08:58 字数 525 浏览 5 评论 0原文

我正在参加一个编译器设计课程,我们必须实现我们自己的编译器(使用 flex 和 bison)。我有解析方面的经验(编写 EBNF 和递归下降解析器),但这是我第一次编写编译器。

语言设计是相当开放的(教授把它留给了我们)。在课堂上,教授回顾了生成中间代码的过程。他说我们在解析时不需要构造抽象语法树或者解析树,我们可以直接生成中间代码。

我发现这令人困惑有两个原因:

  • 如果您在定义函数之前调用函数怎么办?如何解决分支目标?我想你必须制定一条规则,即在使用函数之前必须定义它们,或者可能预先定义它们(就像 C 那样?)

  • 你将如何处理条件?如果您有一个 if-else 甚至只是一个 if,当条件为 时,如何解析 if 的分支目标>false(如果您正在生成代码)?

我计划生成一个 AST,然后在创建它后遍历树,以解析函数和分支目标的地址。这是正确的还是我错过了什么?

I'm taking a compiler-design class where we have to implement our own compiler (using flex and bison). I have had experience in parsing (writing EBNF's and recursive-descent parsers), but this is my first time writing a compiler.

The language design is pretty open-ended (the professor has left it up to us). In class, the professor went over generating intermediate code. He said that it is not necessary for us to construct an Abstract Syntax Tree or a parse tree while parsing, and that we can generate the intermediate code as we go.

I found this confusing for two reasons:

  • What if you are calling a function before it is defined? How can you resolve the branch target? I guess you would have to make it a rule that you have to define functions before you use them, or maybe pre-define them (like C does?)

  • How would you deal with conditionals? If you have an if-else or even just an if, how can you resolve the branch target for the if when the condition is false (if you're generating code as you go)?

I planned on generating an AST and then walking the tree after I create it, to resolve the addresses of functions and branch targets. Is this correct or am I missing something?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

眼趣 2024-10-31 05:08:58

解决这两个问题的一般方法是保留需要“修补”的地址列表。您生成代码并为丢失的地址或偏移量留下漏洞。在编译单元的末尾,您将检查漏洞列表并填充它们。

在 FORTH 中,补丁“列表”保留在控制堆栈上,并在每个控制结构终止时展开。请参阅 FORTH Dimensions

轶事:早期的 Lisp 编译器(我相信是 Lisp)生成了一个列表符号格式的机器代码指令,以及对条件的每个分支的机器代码列表的前向引用。然后它生成向后遍历列表的二进制代码。这样,当需要发出分支指令时,就知道所有前向分支的代码位置。

The general solution to both of your issues is to keep a list of addresses that need to be "patched." You generate the code and leave holes for the missing addresses or offsets. At the end of the compilation unit, you go through the list of holes and fill them in.

In FORTH the "list" of patches is kept on the control stack and is unwound as each control structure terminates. See FORTH Dimensions

Anecdote: an early Lisp compiler (I believe it was Lisp) generated a list of machine code instructions in symbolic format with forward references to the list of machine code for each branch of a conditional. Then it generated the binary code walking the list backwards. This way the code location for all forward branches was known when the branch instruction needed to be emitted.

蛮可爱 2024-10-31 05:08:58

Crenshaw 教程使用任何类型 AST 的具体示例。它构建了一个工作编译器(显然包括条件),可以立即生成针对 m68k 汇编的代码。

你可以花一个下午的时间读完这个文档,这是值得的。

The Crenshaw tutorial is a concrete example of not using an AST of any kind. It builds a working compiler (including conditionals, obviously) with immediate code generation targeting m68k assembly.

You can read through the document in an afternoon, and it is worth it.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文