Learning incremental compilation design
There are a lot of books and articles about creating compilers that do the whole compilation job in one go. But what about the design of incremental compilers/parsers, the kind used by IDEs? I'm familiar with the first class of compilers, but I have never worked with the second.
I tried to read some articles about Eclipse Java Development Tools, but they describe how to use the complete infrastructure (i.e. the APIs) instead of describing the internal design (i.e. how it works internally).
My goal is to implement an incremental compiler for my own programming language. Which books or articles would you recommend?
This book is worth a look: Building a Flexible Incremental Compiler Back-End.
Quote from Ch. 10 "Conclusions":
I think this is what you are looking for...
Edit:
So you plan to create what is known as a "cross compiler"?!
I made a fresh attempt at this. So far I can't point to a definitive reference. If you are planning a project this big, I'm sure you are an experienced programmer, so you may already know these link(s).
Compilers.net
A list of compilers, including cross compilers (translators). Unfortunately some of the links are broken, but 'Toba' still works and links to its source code. Maybe this can inspire you.
clang: a C language family frontend for LLVM
OK, it's for LLVM, but the source is available in an SVN repository and it appears to be a compiler front end (translator). Maybe this can inspire you as well.
I'm going to disagree with conventional wisdom on this one because most conventional wisdom makes unwritten assumptions about your goals, such as complete language designs and the need for extreme efficiency. From your question, I am assuming these goals:
You want to build a hacking harness and a recursive descent parser.
Here is what you might want to build for a harness, using just a text-based processor.
Toggle Run on home hardware (now OFF)
Your command, sire?
You will probably want to write your code in Python or some other scripting language. You are optimizing for your own speed of play, not for execution speed. A recursive descent parser might look like:
So you need to write:
This approach is aimed at speeding up the cycle of hacking the language together. Once you are done with this phase, you can reach for Bison, test harnesses, etc.
Making your own language can be a wonderful journey! Expect to learn. Do not expect to get rich.
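To illustrate the recursive descent approach mentioned in this answer, here is a minimal sketch in Python. It is not the parser from the answer: the toy grammar (integers, `+`, `*`, parentheses) and the names `tokenize`, `Parser`, and `evaluate` are assumptions chosen only for illustration. Each grammar rule becomes one method, which is what makes this style quick to change while the language is still in flux.

```python
import re

# Hypothetical toy grammar, assumed for illustration only:
#   expr   := term ('+' term)*
#   term   := factor ('*' factor)*
#   factor := NUMBER | '(' expr ')'
TOKEN = re.compile(r"\s*(\d+|[+*()])")


def tokenize(text):
    """Split the input into number and operator tokens."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at or after offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    """One method per grammar rule; each returns the value it parsed."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self, expected=None):
        token = self.peek()
        if token is None or (expected is not None and token != expected):
            raise SyntaxError(f"expected {expected or 'a token'}, got {token!r}")
        self.pos += 1
        return token

    def expr(self):
        value = self.term()
        while self.peek() == "+":
            self.take("+")
            value += self.term()
        return value

    def term(self):
        value = self.factor()
        while self.peek() == "*":
            self.take("*")
            value *= self.factor()
        return value

    def factor(self):
        if self.peek() == "(":
            self.take("(")
            value = self.expr()
            self.take(")")
            return value
        token = self.take()
        if not token.isdigit():
            raise SyntaxError(f"expected a number, got {token!r}")
        return int(token)


def evaluate(text):
    """Parse and evaluate a whole input, rejecting trailing tokens."""
    parser = Parser(tokenize(text))
    value = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return value
```

Calling `evaluate("2 + 3 * (4 + 1)")` walks `expr`, `term`, and `factor` in turn, so operator precedence falls out of the rule nesting rather than from a table. Adding a new construct usually means adding or editing one method.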
I see that there is an accepted answer, but I think that some additional material could be usefully included on this page.
I read the Wikipedia article on this topic and it linked to a DDJ article from 1997:
http://www.drdobbs.com/cpp/codestore-and-incremental-c/184410345?pgno=1
The meat of the article is the first page. It explains that the code in the editor is divided into pieces that are "incorporated" into a "CodeStore" (database). The pieces are incorporated via a work queue which contains unincorporated pieces. A piece of code may be parsed and returned to the work queue multiple times, with some failure on each attempt, until it goes through successfully. The database includes dependencies between the pieces so that when the source code is edited the effects on the edited piece and other pieces can be seen and these pieces can be reprocessed.
I believe other systems approach the problem differently. Java presents different problems than C/C++ but has advantages as well, so Eclipse perhaps has a different design.
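The work-queue idea described above can be sketched roughly as follows. This is a toy model in Python, not the design from the DDJ article: the class name `CodeStore` is borrowed from the article's term, but its methods (`edit`, `run`), the explicit `uses` list, and the rule that a piece "parses" only once everything it uses has been incorporated are all assumptions made for illustration. A real system would discover dependencies by actually parsing each piece.

```python
from collections import deque


class CodeStore:
    """Toy database of incorporated pieces plus their dependencies."""

    def __init__(self):
        self.sources = {}       # piece name -> (body, names it uses)
        self.incorporated = {}  # piece name -> body that was incorporated
        self.dependents = {}    # piece name -> set of pieces that use it
        self.queue = deque()    # work queue of unincorporated pieces

    def edit(self, name, body, uses=()):
        """Record new source for a piece, then queue it and its dependents."""
        uses = tuple(uses)
        old_uses = self.sources.get(name, (None, ()))[1]
        for dep in old_uses:
            self.dependents[dep].discard(name)
        for dep in uses:
            self.dependents.setdefault(dep, set()).add(name)
        self.sources[name] = (body, uses)
        self._invalidate(name)

    def _invalidate(self, name):
        # Un-incorporate the piece and everything that transitively uses it.
        self.incorporated.pop(name, None)
        if name not in self.queue:
            self.queue.append(name)
        for user in self.dependents.get(name, ()):
            if user in self.incorporated:
                self._invalidate(user)

    def run(self):
        """Process the queue; return names in the order they were incorporated."""
        done = []
        stalled = 0  # consecutive failed attempts since the last success
        while self.queue and stalled < len(self.queue):
            name = self.queue.popleft()
            body, uses = self.sources[name]
            if all(dep in self.incorporated for dep in uses):
                self.incorporated[name] = body
                done.append(name)
                stalled = 0
            else:
                self.queue.append(name)  # failed this time, retry later
                stalled += 1
        return done


store = CodeStore()
store.edit("main", "print(area(2))", uses=["area"])
store.edit("area", "def area(r): return PI * r * r", uses=["PI"])
store.edit("PI", "PI = 3.14")
first_pass = store.run()

store.edit("PI", "PI = 3.14159")  # editing PI re-queues area and main too
second_pass = store.run()
```

In this sketch `main` and `area` each fail and go back to the queue until `PI` has been incorporated, and a later edit to `PI` re-queues only the pieces that depend on it. Pieces whose dependencies never arrive simply stay in the queue, which roughly mirrors the "some failure on each attempt" behaviour described above.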