Resources on converting a syntax tree to assembly?
Primarily as a learning exercise, I am writing a virtual machine, an assembler, and a compiler from scratch, depending on no external tools.
I believe I have a decent conceptual understanding of how the virtual machine and assembler will work, as well as some parts of the compiler.
Here's what I want to know:
In the compiler, suppose I have turned the source code into a syntax tree. What process do I go through to then convert this syntax tree to assembly?
(Let's assume some simple language constructs, like if and while. I'm looking for a minimal and simple explanation here.)
I am not particularly interested in complex solutions, or solutions based on existing tools. Rather, I'd like something on the order of a 1-page, broad sweeping description of the ideas behind going from syntax tree to assembly.
Anyone know of such a resource?
Thanks :)
Comments (1)
The obligatory response to a compiler question is to read the Dragon book (Compilers: Principles, Techniques, and Tools). When you say that you have turned the source code into a syntax tree, what exactly do you mean? Usually the first stage is to parse the source into an abstract syntax tree (AST). The next step is usually to do attribution. Attributes are properties of nodes in the AST that don't necessarily have anything to do with the source language but are essential to code generation. Usually some form of type checking is done here to determine memory size requirements and, in object-oriented languages, which function is to be called. For instance, if your source is obj1=obj2+obj3, you don't really know what to make of the plus sign until you determine the type of obj2.
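To make the attribution step a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the node classes (`Var`, `Add`, `Assign`), the `symbols` table mapping names to types, and the operation names `ADD_INT` and `CONCAT_STR` are all hypothetical, not something from the Dragon book. The idea is only that a pass over the AST fills in a `type` attribute on each node, and that the meaning of `+` is decided from those types.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Var:
    name: str
    type: Optional[str] = None  # attribute filled in by the attribution pass


@dataclass
class Add:
    left: object
    right: object
    type: Optional[str] = None
    op: Optional[str] = None  # concrete operation that "+" resolves to


@dataclass
class Assign:
    target: Var
    value: object
    type: Optional[str] = None


# Hypothetical mapping from operand type to the operation "+" stands for.
PLUS_FOR_TYPE = {"int": "ADD_INT", "str": "CONCAT_STR"}


def attribute(node, symbols):
    """Annotate every node with its type, bottom-up."""
    if isinstance(node, Var):
        node.type = symbols[node.name]
    elif isinstance(node, Add):
        attribute(node.left, symbols)
        attribute(node.right, symbols)
        if node.left.type != node.right.type:
            raise TypeError("operands of + have different types")
        if node.left.type not in PLUS_FOR_TYPE:
            raise TypeError(f"+ is not defined for {node.left.type}")
        node.type = node.left.type
        node.op = PLUS_FOR_TYPE[node.type]
    elif isinstance(node, Assign):
        attribute(node.target, symbols)
        attribute(node.value, symbols)
        if node.target.type != node.value.type:
            raise TypeError("cannot assign value to target of another type")
        node.type = node.target.type
    else:
        raise ValueError(f"unknown node: {node!r}")
    return node


# obj1 = obj2 + obj3, with all three declared as strings
tree = Assign(Var("obj1"), Add(Var("obj2"), Var("obj3")))
attribute(tree, {"obj1": "str", "obj2": "str", "obj3": "str"})
print(tree.value.op)
```

With the same tree and a symbol table that declares the three names as `int`, the plus node would resolve to `ADD_INT` instead, which is the point of doing attribution before code generation.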
So, to take a shot at answering your question: 1) Parse source code to AST. 2) Do attribution on the AST. 3) Generate intermediate code (which I imagine is what you are referring to as assembly).
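As a rough illustration of step 3, here is a small Python sketch that walks an already-built AST and emits code for a stack machine. The tuple-based node shapes and the instruction names (`PUSH`, `LOAD`, `STORE`, `ADD`, `SUB`, `MUL`) are made up for this example; your VM's instruction set will differ. The pattern is a post-order walk: generate code for the children first, then emit the instruction for the node itself.

```python
# Hypothetical mapping from source operators to stack-machine instructions.
BINOPS = {"+": "ADD", "-": "SUB", "*": "MUL"}


def gen(node, out):
    """Append stack-machine instructions for `node` to the list `out`."""
    kind = node[0]
    if kind == "num":            # ("num", 2)
        out.append(f"PUSH {node[1]}")
    elif kind == "var":          # ("var", "y")
        out.append(f"LOAD {node[1]}")
    elif kind == "binop":        # ("binop", "+", left, right)
        _, op, left, right = node
        gen(left, out)           # leaves the left value on the stack
        gen(right, out)          # leaves the right value on top of it
        out.append(BINOPS[op])   # pops both, pushes the result
    elif kind == "assign":       # ("assign", "x", value)
        _, name, value = node
        gen(value, out)
        out.append(f"STORE {name}")
    else:
        raise ValueError(f"unknown node kind: {kind}")
    return out


# x = y + 2 * z
ast = ("assign", "x",
       ("binop", "+", ("var", "y"),
        ("binop", "*", ("num", 2), ("var", "z"))))
for line in gen(ast, []):
    print(line)
```

For a register machine the walk is the same, but each call would also have to return the register holding its result.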
Chapters 5 and 6 of the Dragon book cover this in good detail. The really tricky part is figuring out what attributes you need for code generation. There are also some tricky issues with if statements. For instance, if the if condition fails, you know you need to jump over some code but, at least initially, you don't know how far. Backpatching is one solution to this problem: emit the jump with a placeholder target, then fill in the target once you know where the skipped code ends.
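Here is a minimal sketch of backpatching for if and while, in Python. The `Emitter` class, the tuple-based statement nodes, and the instruction names `JUMP` and `JUMP_IF_FALSE` are all assumptions for illustration, and the `("op", NAME)` node is just a stand-in for whatever code a condition or simple statement would really produce. Jump targets here are absolute instruction indices.

```python
class Emitter:
    def __init__(self):
        self.code = []  # each entry is [opcode, argument]

    def emit(self, op, arg=None):
        """Append an instruction and return its index."""
        self.code.append([op, arg])
        return len(self.code) - 1

    def here(self):
        """Index the next emitted instruction will get."""
        return len(self.code)

    def patch(self, index, target):
        """Fill in the target of a jump emitted earlier."""
        self.code[index][1] = target


def gen_stmt(node, e):
    kind = node[0]
    if kind == "op":                        # ("op", "NAME"): placeholder code
        e.emit(node[1])
    elif kind == "block":                   # ("block", [stmt, ...])
        for stmt in node[1]:
            gen_stmt(stmt, e)
    elif kind == "if":                      # ("if", cond, then, else_or_None)
        _, cond, then, otherwise = node
        gen_stmt(cond, e)
        skip_then = e.emit("JUMP_IF_FALSE")  # target not known yet
        gen_stmt(then, e)
        if otherwise is None:
            e.patch(skip_then, e.here())
        else:
            skip_else = e.emit("JUMP")       # target not known yet
            e.patch(skip_then, e.here())     # false case lands on the else
            gen_stmt(otherwise, e)
            e.patch(skip_else, e.here())
    elif kind == "while":                   # ("while", cond, body)
        _, cond, body = node
        top = e.here()                       # backward target is already known
        gen_stmt(cond, e)
        exit_loop = e.emit("JUMP_IF_FALSE")  # forward target not known yet
        gen_stmt(body, e)
        e.emit("JUMP", top)
        e.patch(exit_loop, e.here())
    else:
        raise ValueError(f"unknown statement kind: {kind}")


program = ("while", ("op", "COND_A"),
           ("if", ("op", "COND_B"), ("op", "THEN"), ("op", "ELSE")))
e = Emitter()
gen_stmt(program, e)
for index, (op, arg) in enumerate(e.code):
    print(index, op, "" if arg is None else arg)
```

Only forward jumps need patching; the jump back to the top of a while loop can be emitted with its target right away. If your assembler supports labels, an alternative is to emit freshly named labels and let the assembler resolve them, which moves the same bookkeeping into the assembler.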