与递归下降解析器一起生成输出
编写简单的解析器很简单,多年来我已经实现了几个。在大学里,我们也必须写一篇。但我们从来不需要使用这种方法生成有意义的输出; 我们从未学习过如何创建后端。
如果我有一个用于简化 Pascal 的有效递归下降解析器,并且我想将代码转换为 C++,我将如何去做呢?我认为不需要中间步骤,例如生成抽象语法树。
那么我该如何输出编译或翻译的代码? 我发现的唯一有用的示例是 Jack Crenshaw 的教程,但它更多地关注前端,就像大多数其他资源一样。我的解析器代码和语法之间的关系非常明显。解析器方法和输出之间的关系如何?我的函数声明解析器方法可以与一个 EmitLn(此处为 C++ 代码)调用相关。但是解析器方法并不那么容易,例如表达式。表达式被分解为可能更多的调用,因此暗示需要 Emit() 函数,该函数允许我逐段分解表达式的输出代码。 是否有用于输出代码的样板代码,例如 Jack Crenshaw 的 Lets Build a Compiler 中的 EmitLn 函数?这也表明我需要维护一个基本的符号表,这是大多数示例中经常忽略的另一件事。
我说得对吗?我还应该知道什么?有什么提示/建议或资源吗?我的大问题是,编译器前端有很多教程,但是后端的一些解释怎么样?我可以解析语言,现在我想将其发展为能够将它们翻译并编译成其他语言。
Writing simple parsers are trivial and I have implemented several over the years. In college, we also had to write one. But we never had to generate meaningful output using this method; we never learned how to create a back end.
If I have a working recursive descent parser for a simplified Pascal and I wanted to translate the code to C++, how would I go about doing this? I dont see a need for an intermediate step such as generating an abstract syntax tree.
So how do I go about outputting the compiled or translated code? The only useful example I have found of this is in Jack Crenshaw's tutorial but it focuses more on the front end, as does most other resources. The relationship between my parser code and my grammar is very obvious. What about the relationship between the parsers methods and the outputting? My parser method for a function declaration can relate to exactly one EmitLn(C++ code here) call. But what about parser methods that aren't that easy, such as expressions. Expressions are broken down to possibly many more calls, so that alludes to the need of a Emit() function that allows me to break down the output code for an expression piece by piece. Is there any boiler plate code for outputting code, such as the EmitLn function in Jack Crenshaw's Lets Build a Compiler? This also shows me I need to maintain a basic symbol table, another thing that is often omitted in most examples.
Am I right? What else should I know? Any tips/suggestions, or resources? My big question is, there are so many tutorials out there for compiler front ends, but how about some explanation on the back end? I can parse languages, now I want to evolve that into being able to translate and compile them into other languages.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
有一些简单的编译器使用“动态”代码生成器来吐出代码
正如他们解析的那样,正如您似乎正在尝试做的那样。好消息是他们看起来
简单的。
问题在于,编译从根本上来说就是传播信息
将代码的一部分复制到另一部分,以实现良好的代码生成。为此,
编译器人员很久以前就决定将解析与代码生成分开。
解析器解析代码并构建另一个中间数据结构(通常是一个
代表程序的抽象语法树或三元组。
通常,对中间结构进行非常深入的分析
确定如何生成好的代码,然后适当的好
代码生成。做好这项工作的技术很复杂,
但由于需要生成良好(且正确)的代码而受到充分的激励。
尝试检查标准编译器资源。
学习编写编译器
这是非常值得你的
是时候详细阅读其中一些参考文献而不是黑客攻击了。
如果你坚持只看一本,《阿霍和厄尔曼》就是一本经典之作。
如果您想了解如何动态生成代码
使其工作得相当好,检查一下
元编译器。
There are trivial compilers using "on the fly" code generators that spit code
as they parse, as you seem to be trying to do. The good news is that they look
easy.
The problem is that compilation is fundamentally about spreading information from
one part of the code to another, to enable good code generation. For that,
compiler people long ago decided to separate parsing from code generation.
The parser parses the code and builds another intermediate data structure (often an
abstract syntax tree or triples) that represents the program.
Often, very deep analysis is then lavished on the intermediate structure
to determine how to generate good code, followed by appropriate good
code generation. The techniques for doing this well are complex,
but well motivated by the need to produce good (and correct) code.
Try checking out the standard compiler resources.
Learning to write a compiler
It is well worth your
time to read some of these references in detail instead of hacking.
If you insist on just one, Aho and Ullman is the classic book.
If you want to see the how on the fly code generation can
be made to work reasonably well, check out
metacompilers.
简答 在匹配语法中的规则后,调用一个函数来根据输入执行某些操作。
长答案:来自 Terence Parr 讨论语法导向解释器的语言实现模式:
这正是我在最初的问题中想要表达的想法。他在书中继续描述了该技术、其工作原理、其组成以及局限性。它适用于 DSL 和小语言,而不是通用语言。
他指出,IF 和循环等结构很难实现。虽然这可能是真的,但我用我的语言 Peay BASIC 实现这些功能没有遇到任何困难。由指令序列或简单语句组成的语言非常适合这种模式;他以文本处理语言为例。
语法被扩展为由“匹配这个,调用那个”对组成。因此,语法中的必要规则应该有一个与之相关的操作。赋值语句示例:
Short answer After matching a rule in your grammar, call a function that does something based on the input.
Long answer: From Language Implementation Patterns by Terence Parr discussing Syntax-Directed Interpreters:
This is precisely the idea that I was going for in my original question. Continuing in the book he describes the technique, how it works, what it consists of, and the limitations. It works for DSL and small languages rather than general purpose languages.
He states that constructs such as IFs and loops are awkward to implement. While this is likely true, I had no trouble implementing such features in a language of mine, Peay BASIC. Languages that are sequences of instructions or simple statements are ideal for this pattern; he gives text-processing languages as an example.
The grammar is extended to consist of "match this, call that" pairs. So the neccessary rules in your grammar should have an action associated with it. Example of an assignment statement: