The relationship between parsing, highlighting and completion
For some time now I've been thinking about designing a small toy language from scratch, nothing that will "Rule The World", but mostly as an exercise. I realize there is a lot to learn in order to accomplish this.
This question is about three different concepts (parsing, code highlighting and completion) that strike me as extremely similar. Of course, parsing and AST generation are part of compilation, while code highlighting and completion are more features of the IDE, yet I wonder what the similarities and differences are.
I need some hints from someone more experienced in this topic. What code can be shared between these concepts, and what architectural considerations would help with that sharing?
Comments (1)
What you want is a syntax-directed structure editor. This is an editor that combines parsing with AST building and either uses the parser to predict what you can type next (syntax completion), or is tied to the compiler's last run, so that it can interpret the edit point and work out which identifiers are valid there by inspecting the compiler's symbol table that was last relevant at that point in the code.
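To make the two kinds of completion concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the toy grammar (`let NAME = operand (('+'|'-') operand)* ;`), and the names `tokenize`, `Parser`, `AtCursor` and `complete`. The parser raises `AtCursor` when it runs out of input, carrying the token kinds it would have accepted; keywords and operators are proposed directly (syntax completion), while an expected identifier is filled in from the names declared so far (a stand-in for the compiler's symbol table).

```python
import re

# Toy lexer: keyword 'let', identifiers, integers, and a few operators.
TOKEN_RE = re.compile(
    r"\s*(?:(?P<kw>let\b)|(?P<name>[A-Za-z_]\w*)|(?P<num>\d+)|(?P<op>[=+\-;]))"
)

def tokenize(text):
    """Return (kind, value) pairs; keywords and operators use their text as kind."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if not m:
            pos += 1  # skip characters the toy lexer does not know
            continue
        kind = m.lastgroup
        value = m.group(kind)
        if kind in ("kw", "op"):
            kind = value
        tokens.append((kind, value))
        pos = m.end()
    return tokens

class AtCursor(Exception):
    """Raised when the parser reaches the end of the text (the edit point)."""
    def __init__(self, expected):
        super().__init__(expected)
        self.expected = expected

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0
        self.symbols = []  # names declared by complete statements

    def expect(self, *kinds, fresh=False):
        """Consume a token of one of `kinds`, or report what was expected."""
        if self.pos == len(self.tokens):
            # A freshly declared name cannot be completed from existing symbols.
            raise AtCursor(() if fresh else kinds)
        kind, value = self.tokens[self.pos]
        if kind not in kinds:
            raise SyntaxError(f"expected one of {kinds}, got {value!r}")
        self.pos += 1
        return value

    def statement(self):
        self.expect("let")
        name = self.expect("name", fresh=True)
        self.expect("=")
        self.expect("name", "num")
        while self.expect("+", "-", ";") != ";":
            self.expect("name", "num")
        self.symbols.append(name)  # visible only to later statements

    def program(self):
        while True:  # ends when expect() raises at the end of the text
            self.statement()

def complete(text):
    """Propose what may be typed at the end of `text`."""
    parser = Parser(tokenize(text))
    try:
        parser.program()
    except AtCursor as cursor:
        proposals = []
        for kind in cursor.expected:
            if kind == "name":
                proposals.extend(parser.symbols)  # semantic completion
            elif kind != "num":
                proposals.append(kind)            # syntax completion
        return proposals
    except SyntaxError:
        return []  # a real editor would recover here instead of giving up
```

With this sketch, `complete("")` returns `['let']` and `complete("let a = 1; let b = ")` returns `['a']`. It only completes at the end of the text and does no prefix filtering, which a real editor would need.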
The most difficult part is offering the user a seamless experience; she pretty much has to believe she is editing text or (experience with structure editors shows) she will reject it as awkward.
This is a lot of machinery to coordinate and quite a big effort. The good news is that you need a parser anyway for the compiler; if editing also parses, the AST needed by the compiler is essentially available. (Of course, you have to worry about batch compiling, too.) The compiler has to build a symbol table, so you can use that in the completion process while editing. The more difficult news is that the parsers are a lot harder to build. They can't just declare a user-visible syntax error and quit; they have to tolerate a number of errors extant at the same moment, hold partial ASTs for the pieces, and stitch them together as the user removes the errors.
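As a rough illustration of "hold partial ASTs and stitch them together", here is about the crudest form of error tolerance I can think of, sketched in Python: resynchronise at every `;`, keep a real node for each statement that parses, and keep an `Error` node for each one that doesn't. The toy statement form (`let NAME = VALUE`) and the names `Let`, `Error`, `parse_statement`, `parse_tolerant` and `reparse` are all made up for the example; this is not how Harmonia does it.

```python
import re
from dataclasses import dataclass

@dataclass
class Let:
    """AST node for a well-formed statement."""
    name: str
    value: str

@dataclass
class Error:
    """Placeholder node that keeps the broken text so nothing is lost."""
    text: str
    message: str

# Assumed toy statement syntax: let NAME = VALUE
STATEMENT_RE = re.compile(r"let\s+([A-Za-z_]\w*)\s*=\s*(\w+)")

def parse_statement(chunk):
    """Parse one statement; on failure return an Error node instead of raising."""
    m = STATEMENT_RE.fullmatch(chunk.strip())
    if m:
        return Let(m.group(1), m.group(2))
    return Error(chunk.strip(), "expected: let NAME = VALUE")

def parse_tolerant(source):
    """Parse every statement, using ';' as the resynchronisation point."""
    return [parse_statement(chunk) for chunk in source.split(";") if chunk.strip()]

def reparse(nodes, index, new_text):
    """Re-parse only the edited statement and splice it back among its neighbours."""
    stitched = list(nodes)
    stitched[index] = parse_statement(new_text)
    return stitched
```

Parsing `"let a = 1; let = 2; let c = a;"` gives two `Let` nodes with an `Error` node between them, so completion and highlighting can still use the good parts. Once the user fixes the middle statement, `reparse(nodes, 1, "let b = 2")` replaces just that node.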
The Berkeley Harmonia people are doing good work in this area. It is well worth your trouble to read some of their papers to get a detailed sense of the problems and one approach to handling them.
The other major approach people (notably Intentional Programming and XText) seem to be trying is the object-oriented editor, where you attach editing actions to each AST node and associate every point on the screen with an AST node. Editing actions then invoke AST-node-specific operations (insert-character, go right, go up, ...), and each node decides how to act and how to modify the screen. Arguably you can make these editors do anything; it's a little harder in practice. I've used these editors; they don't feel like text editors. There are some enthusiastic users, but YMMV.
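To give a feel for that style, here is a small Python sketch. It is my own toy illustration, not the design of Intentional Programming or XText; the class names (`Node`, `Name`, `Number`, `Add`, `Editor`) and the placeholder texts are invented. Each node renders itself and decides what to do with a keystroke, and the editor merely forwards the action to whichever node has the focus.

```python
class Node:
    """Base class: every AST node draws itself and reacts to editing actions."""
    def render(self):
        raise NotImplementedError

    def insert_char(self, ch):
        return False  # by default a node refuses the keystroke

    def leaves(self):
        return [self]  # editable positions, in screen order

class Name(Node):
    def __init__(self):
        self.text = ""

    def render(self):
        return self.text or "<name>"  # placeholder shown for an empty hole

    def insert_char(self, ch):
        # Identifier rule: letter or '_' first, digits allowed afterwards.
        ok = ch == "_" or ch.isalpha() or (bool(self.text) and ch.isdigit())
        if ok:
            self.text += ch
        return ok

class Number(Node):
    def __init__(self):
        self.digits = ""

    def render(self):
        return self.digits or "<number>"

    def insert_char(self, ch):
        ok = ch.isdigit()
        if ok:
            self.digits += ch
        return ok

class Add(Node):
    def __init__(self, left, right):
        self.left, self.right = left, right

    def render(self):
        return f"{self.left.render()} + {self.right.render()}"

    def leaves(self):
        return self.left.leaves() + self.right.leaves()

class Editor:
    """Maps the cursor to an AST node and forwards editing actions to it."""
    def __init__(self, root):
        self.root = root
        self.slots = root.leaves()
        self.focus = 0

    def type(self, ch):
        return self.slots[self.focus].insert_char(ch)

    def go_right(self):
        self.focus = min(self.focus + 1, len(self.slots) - 1)

    def screen(self):
        return self.root.render()
```

Typing `x`, `1`, go-right, `4` into `Editor(Add(Name(), Number()))` leaves `x1 + 4` on the screen, while typing `+` inside the name or `a` inside the number is simply refused. That refusal is exactly the kind of behaviour that makes these editors feel unlike a text editor.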
I think you probably ought to choose between trying to build such an editor and trying to define a new language. Doing both at once is likely to overwhelm you with troubles.