Learning incremental compilation design

Posted on 2024-11-07 10:28:09

There are many books and articles about building compilers that do the whole compilation job at once. But what about the design of incremental compilers and parsers, the kind used by IDEs? I'm familiar with the first class of compilers, but I have never worked with the second.

I tried reading some articles about the Eclipse Java Development Tools, but they describe how to use the finished infrastructure (i.e. the APIs) rather than its internal design (i.e. how it works internally).

My goal is to implement an incremental compiler for my own programming language. Which books or articles would you recommend?

Comments (3)

叹沉浮 2024-11-14 10:28:09

This book is worth a look: Building a Flexible Incremental Compiler Back-End.

Quote from Ch. 10 "Conclusions":

This paper has explored the design of the back-end of an incremental compilation system. Rather than building a single fixed incremental compiler, this paper has presented a flexible framework for constructing such systems in accordance with user needs.

I think this is what you are looking for...

Edit:
So you plan to create what is known as a "cross compiler"?
I took another look. So far I can't point to a definitive reference. If you are planning such a big project, I'm sure you are an experienced programmer, so you may already know these links.

Compilers.net
A list of compilers, including cross compilers (translators). Unfortunately some of the links are broken, but 'Toba' still works and links to its source code. Maybe this can inspire you.

clang: a C language family frontend for LLVM
OK, it's for LLVM, but the source is available in an SVN repository, and it appears to be a compiler front end (translator). Maybe this can inspire you as well.

谎言 2024-11-14 10:28:09

I'm going to disagree with conventional wisdom on this one because most conventional wisdom makes unwritten assumptions about your goals, such as complete language designs and the need for extreme efficiency. From your question, I am assuming these goals:

  • learn about writing your own language
  • play around with your language until it looks elegant
  • try to emit code into another language or byte code for actual execution.

You want to build a hacking harness and a recursive descent parser.

Here is what you might want to build for a harness, using nothing more than a text-based processor:

  1. Change the code fragment (now "AT 0700 SET HALLWAY LIGHTS ON FULL")
  2. Compile the fragment
  3. Change the code file (now "tests.l")
  4. Compile from file
  5. Toggle Lexer output (now ON)
  6. Toggle Emitter output (now ON)
  7. Toggle Run on home hardware (now OFF)

    Your command, sire?
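
A menu like the one above could be sketched roughly as follows. This is only an illustration: HarnessState, menu_text, handle and compile_text are hypothetical names, and compile_text is a placeholder standing in for whatever lexer, parser and emitter you end up writing.

```python
from dataclasses import dataclass


@dataclass
class HarnessState:
    # Defaults mirror the menu shown above.
    fragment: str = "AT 0700 SET HALLWAY LIGHTS ON FULL"
    source_file: str = "tests.l"
    show_lexer: bool = True
    show_emitter: bool = True
    run_on_hardware: bool = False


def on_off(flag):
    return "ON" if flag else "OFF"


def menu_text(state):
    return "\n".join([
        f'1. Change the code fragment (now "{state.fragment}")',
        "2. Compile the fragment",
        f'3. Change the code file (now "{state.source_file}")',
        "4. Compile from file",
        f"5. Toggle Lexer output (now {on_off(state.show_lexer)})",
        f"6. Toggle Emitter output (now {on_off(state.show_emitter)})",
        f"7. Toggle Run on home hardware (now {on_off(state.run_on_hardware)})",
    ])


def compile_text(text, state):
    # Placeholder (assumption): a real harness would lex, parse and emit here.
    return f"compiled {len(text.split())} tokens"


def handle(choice, state, read_line=input):
    """Apply one menu choice; return text to print, or None."""
    if choice == "1":
        state.fragment = read_line("New fragment: ")
    elif choice == "2":
        return compile_text(state.fragment, state)
    elif choice == "3":
        state.source_file = read_line("New file: ")
    elif choice == "4":
        with open(state.source_file) as f:
            return compile_text(f.read(), state)
    elif choice == "5":
        state.show_lexer = not state.show_lexer
    elif choice == "6":
        state.show_emitter = not state.show_emitter
    elif choice == "7":
        state.run_on_hardware = not state.run_on_hardware
    else:
        return "Unknown command."
    return None


def main():
    state = HarnessState()
    while True:
        print(menu_text(state))
        choice = input("\n    Your command, sire? ").strip()
        if choice in ("q", "quit"):
            break
        result = handle(choice, state)
        if result:
            print(result)
```

Calling main() starts the loop. Keeping handle() separate from the input loop means each menu action can also be driven from a script or a test.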

You will probably want to write your code in Python or some other scripting language. You are optimizing for how quickly you can experiment, not for execution speed. A recursive descent parser might look like:

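def cmd_at():
    # Parse "AT <time> <location> ..." and emit an alarm registration.
    # next_token, next_num, emit, match_token, cTIME and LOCATION come from
    # the surrounding lexer/emitter code, which is not shown here.
    if next_token.type == cTIME:
        time = next_token.text  # e.g. "0700"; the .text attribute is an assumption
        num = next_num()        # number used to name the generated handler
        emit("events.setAlarm(events.DAILY, converttime(" + time[0:2] + ", "
             + time[2:] + "), func_" + str(num) + ");")
        match_token(cTIME)
        match_token(LOCATION)
        ...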

So you need to write:

  • A little menu for hacking.
  • Some lexing routines, to return different tokens for numbers, reserved words, and the like.
  • A bunch of logic for what your language does.
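
As a rough illustration of the lexing routines, here is a minimal regex-based sketch. The token names (TIME, NUMBER, IDENT), the RESERVED word list and the Token type are assumptions made for this example, not part of the parser fragment above.

```python
import re
from typing import NamedTuple


class Token(NamedTuple):
    type: str
    text: str


# Hypothetical reserved words for the home-automation example language.
RESERVED = {"AT", "SET", "ON", "OFF", "FULL"}

# Order matters: a 4-digit time is tried before a general number.
TOKEN_RE = re.compile(
    r"\s*(?:(?P<TIME>\d{4}\b)|(?P<NUMBER>\d+)|(?P<WORD>[A-Za-z_]\w*))"
)


def tokenize(source):
    """Split source text into Token(type, text) pairs."""
    tokens = []
    pos = 0
    source = source.rstrip()
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if m is None:
            raise SyntaxError(f"cannot tokenize input at offset {pos}")
        kind = m.lastgroup
        text = m.group(kind)
        if kind == "WORD":
            # Reserved words get their own token type; anything else is IDENT.
            kind = text.upper() if text.upper() in RESERVED else "IDENT"
        tokens.append(Token(kind, text))
        pos = m.end()
    return tokens
```

For example, tokenize("AT 0700 SET HALLWAY LIGHTS ON FULL") yields a reserved-word token for AT, a TIME token for 0700, and IDENT tokens for HALLWAY and LIGHTS. A separate LOCATION token type would need a table of known locations, which is left out here.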

This approach is aimed at speeding up the cycle of hacking the language together. Once you have finished with this approach, you can reach for Bison, test harnesses, and so on.

Making your own language can be a wonderful journey! Expect to learn. Do not expect to get rich.

兔姬 2024-11-14 10:28:09

I see that there is an accepted answer, but I think that some additional material could be usefully included on this page.

I read the Wikipedia article on this topic and it linked to a DDJ article from 1997:

http://www.drdobbs.com/cpp/codestore-and-incremental-c/184410345?pgno=1

The meat of the article is the first page. It explains that the code in the editor is divided into pieces that are "incorporated" into a "CodeStore" (database). The pieces are incorporated via a work queue which contains unincorporated pieces. A piece of code may be parsed and returned to the work queue multiple times, with some failure on each attempt, until it goes through successfully. The database includes dependencies between the pieces so that when the source code is edited the effects on the edited piece and other pieces can be seen and these pieces can be reprocessed.
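
The work-queue idea described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the design from the DDJ article: the CodeStore class here, its method names, and the rule that a piece "parses successfully" once every name it uses is already in the store are all invented for illustration.

```python
from collections import deque


class CodeStore:
    """Toy model: pieces are incorporated once everything they use is known."""

    def __init__(self):
        self.incorporated = {}  # piece name -> frozenset of names it uses
        self.queue = deque()    # unincorporated pieces as (name, uses)

    def submit(self, name, uses=()):
        self.queue.append((name, frozenset(uses)))

    def process(self):
        """Drain the work queue; return names in incorporation order."""
        order = []
        failures_in_a_row = 0
        # Stop once every queued piece has failed since the last success.
        while self.queue and failures_in_a_row < len(self.queue):
            name, uses = self.queue.popleft()
            if all(u in self.incorporated for u in uses):
                self.incorporated[name] = uses
                order.append(name)
                failures_in_a_row = 0
            else:
                self.queue.append((name, uses))  # retry later
                failures_in_a_row += 1
        return order

    def dependents(self, name):
        """Incorporated pieces that directly or indirectly use `name`."""
        found, stack = set(), [name]
        while stack:
            current = stack.pop()
            for other, uses in self.incorporated.items():
                if current in uses and other not in found:
                    found.add(other)
                    stack.append(other)
        return found

    def edit(self, name, uses=()):
        """Replace a piece and re-queue it along with everything using it."""
        stale = self.dependents(name)
        stale.discard(name)
        self.incorporated.pop(name, None)
        self.submit(name, uses)
        for other in sorted(stale):
            self.submit(other, self.incorporated.pop(other))
```

Submitting main (uses helper), helper (uses Point) and Point in that order, then calling process(), incorporates them as Point, helper, main, with main and helper each going back to the queue at least once. A later edit("helper", uses={"Point"}) re-queues only helper and main and leaves Point untouched, which is the incremental part.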

I believe other systems approach the problem differently. Java presents different problems than C/C++ but has advantages as well, so Eclipse perhaps has a different design.
