Documentation of Smart Indent algorithms?

Posted 2024-08-02 20:51:15

I'm a big fan of documenting the proper behavior of IDE features that have a subtle but significant impact on coding flow: things like auto-completion selection and commenting/uncommenting code, which you might not realize you rely on, but which let you get just a bit more done by the end of the day. I do so in the hope that other language services I have to use will incorporate these features, improving my daily coding life. "Real" Smart Indent, as found in the Visual Studio 2008 C# editor, is one of those features.

Basic block code indentation is reasonably straightforward and can be hacked together quickly, well enough to get the job done. True Smart Indent, on the other hand, is quite possibly the most technically challenging task I've had to implement in an IDE to date, and I've implemented my fair share. Even full-blown on-the-fly automatic code reformatting is easier; it just defers to Smart Indent for the heavy lifting.
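To make the "basic block indentation" baseline concrete, here is a minimal sketch of the naive approach: count unmatched opening braces on the preceding lines. The function name `basic_indent_level` and the 4-space width are assumptions made for illustration; this is not how Visual Studio or any particular editor does it. Note that it deliberately ignores braces inside strings and comments, continuation lines, and `case` labels, which is exactly the kind of gap that separates it from true Smart Indent.

```python
def basic_indent_level(lines_before, indent_width=4):
    """Return the indentation (in spaces) for a new line typed after lines_before.

    Naive rule: one indent level per unmatched '{' seen so far.
    Braces inside string literals or comments are (wrongly) counted too.
    """
    depth = 0
    for line in lines_before:
        for ch in line:
            if ch == "{":
                depth += 1
            elif ch == "}":
                # Never go negative on unbalanced input.
                depth = max(depth - 1, 0)
    return depth * indent_width


if __name__ == "__main__":
    code = ["class A {", "    void f() {"]
    print(basic_indent_level(code))  # two open blocks -> 8 spaces
```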

I'm looking for high-level discussions of general purpose Smart Indent algorithms. In particular, I'm looking for either research on smart indent strategies, or an objective description of all normal and "edge" cases that could be tested to ensure repeatable, bug-free results. Eventually, I'd like to provide a detailed workflow of the functionality and a concrete foundation for actually implementing the feature, and then assemble a language-specific version from that and integrate it into my language services.

PS: Visual Studio 2010's C# editor has several small bugs in this feature. Having implemented it myself, I have a whole new respect for the work it takes to polish it.

Edit (8/25): I managed to write down a draft of the rules for how I think things should be handled when Smart Indent is invoked inside a code comment. I'll probably work on the rules from a C++/C# perspective, but later they should be parameterizable for other languages.

Comments (4)

铁轨上的流浪者 2024-08-09 20:51:16

As another responder said, the key idea for doing this right is prettyprinting, that is, generating text from the abstract syntax structure of the code.

Basically you take advantage of the nesting of the tree to produce nesting of the printed text. Key ideas are building primitive strings from the leaves of the tree, gluing boxes [rectangles of text] from subtrees side by side to provide horizontal composition, and gluing boxes on top of one another to get bigger vertical boxes.
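A minimal sketch of the box idea, assuming a box is simply a list of text lines. The helper names (`leaf`, `hbox`, `vbox`, `indent`) are made up for this illustration and are not the DMS API. Horizontal composition here appends the right box to the last line of the left box and pads the remaining lines so the right box stays rectangular.

```python
def leaf(text):
    """A primitive box built from a tree leaf: a single line of text."""
    return [text]


def hbox(left, right):
    """Glue `right` onto the end of the last line of `left`."""
    if not left:
        return list(right)
    if not right:
        return list(left)
    pad = " " * len(left[-1])
    joined = left[-1] + right[0]
    # Lines after the first line of `right` are shifted to keep its shape.
    return left[:-1] + [joined] + [pad + line for line in right[1:]]


def vbox(*boxes):
    """Stack boxes on top of one another."""
    out = []
    for box in boxes:
        out.extend(box)
    return out


def indent(box, n=4):
    """Shift a whole box right by n columns."""
    return [" " * n + line for line in box]


if __name__ == "__main__":
    body = vbox(leaf("y();"), leaf("z();"))
    stmt = vbox(hbox(leaf("if (x) "), leaf("{")), indent(body), leaf("}"))
    print("\n".join(stmt))
```

The nesting of `indent(...)` calls mirrors the nesting of the tree, which is where the indentation of the printed text comes from.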

Tricky parts: regenerating the language literals with formatting information from the tree leaves (just how many leading zeros did that binary floating-point number have?), handling right margin overflow by allowing alternative box layouts and backtracking, and pattern-matching complex tree structures to prettyprint particular trees in nice ways (e.g., nested if-then-if-then-if....)
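The margin-overflow point can be sketched as "offer several layouts for the same construct and take the first one that fits". This is a simplified stand-in for real backtracking, which would also revisit choices made in enclosing boxes; the names `fits`, `first_fit`, and `layout_call` are assumptions for illustration only.

```python
def fits(box, width):
    """A box (list of lines) fits if no line crosses the right margin."""
    return all(len(line) <= width for line in box)


def first_fit(alternatives, width):
    """Return the first alternative layout that fits, else the last one."""
    for box in alternatives:
        if fits(box, width):
            return box
    return alternatives[-1]


def layout_call(name, args, width=40, indent=4):
    """Lay out a call either on one line or with one argument per line."""
    flat = [f"{name}({', '.join(args)})"]
    pad = " " * indent
    stacked = (
        [f"{name}("]
        + [pad + a + "," for a in args[:-1]]
        + [pad + a for a in args[-1:]]
        + [")"]
    )
    return first_fit([flat, stacked], width)


if __name__ == "__main__":
    print("\n".join(layout_call("compute", ["alpha", "beta"], width=10)))
```

Ordering the alternatives from most to least compact means the vertical layout is used only when the horizontal one overflows.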

Here's a research paper on the topic (Full text PDF).

Here's what we did with the DMS Software Reengineering Toolkit to prettyprint ASTs generated by large-scale metaprogramming.

℉絮湮 2024-08-09 20:51:16

Maybe I'm missing something, but the "smart indentation" would be entirely tied up in the grammar specification of the language. The closest thing to an academic paper I could find after a bit of google-fu was, in fact, another SO question pertaining to a particular language, here.

So, I'm afraid I can't technically provide an answer, as I did not find any academic papers, but as a sort of meta-answer (sadly, in the form of a question): is it any harder than parsing the language? I use the term "harder" in a vague computability/complexity sense, not referring to the actual time/effort/tears a person would actually put in.

Consider: in my experience, the indentation level changes within certain sub-clauses: if statements, loops, classes, structs, and so on. All of these are already detected by the parser. Just as one can decorate a parse tree to build a semantic tree (here's a shard of a random university website), can't you instead decorate the parse tree with "indent information"?
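A minimal sketch of what "decorating the parse tree with indent information" could look like, assuming a toy tree with only two node kinds. The `Node` class and the `decorate`/`render` functions are hypothetical and stand in for whatever tree a real parser produces; a real language would need many more node kinds and rules.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str                      # "block" opens a nested scope, "stmt" does not
    text: str = ""                 # text that opens the node, e.g. "if (x) {"
    children: list = field(default_factory=list)
    close: str = ""                # closing text, e.g. "}"
    indent: int = 0                # decoration filled in by decorate()


def decorate(node, depth=0):
    """Annotate every node with its indentation depth."""
    node.indent = depth
    child_depth = depth + 1 if node.kind == "block" else depth
    for child in node.children:
        decorate(child, child_depth)


def render(node, width=4):
    """Emit the tree as lines, using the indent decoration."""
    pad = " " * (node.indent * width)
    lines = [pad + node.text] if node.text else []
    for child in node.children:
        lines.extend(render(child, width))
    if node.close:
        lines.append(pad + node.close)
    return lines


if __name__ == "__main__":
    method = Node("block", "void f() {", [Node("stmt", "g();")], close="}")
    tree = Node("block", "class A {", [method], close="}")
    decorate(tree)
    print("\n".join(render(tree)))
```

This only works when a complete tree exists; it says nothing about what to do while the user is halfway through typing a construct that does not parse yet.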

I guess I just don't see what the call for academic papers is all about. Unless, of course, there's something I'm missing, which is quite possible, as I've certainly never dared attempt this. :) But, from my vantage point, it would seem that this smart indenting is possible simply by running a modified parser that, instead of reporting "parse errors", automatically reformats the code so that it is valid (assuming that the "real" parser already okays the block). Real-time running would certainly cause issues, and there are ambiguous levels of indentation in whitespace-dependent languages (as a change in indent level is what ends the block).

As a final (honestly, I'm almost done! :)) note: the Emacs text editor is shockingly good, in my experience. I have no idea how it works, but if I were to try this, that would be the first place I'd look... after SO, of course. :))

水水月牙 2024-08-09 20:51:15

Emacs CC Mode manual: Indentation Engine Basics.

Steve Yegge blog rant: js2-mode: a new JavaScript mode for Emacs.

Quote from the latter: "Amazingly, surprisingly, counterintuitively, the indentation problem is almost totally orthogonal to parsing and syntax validation."

萌化 2024-08-09 20:51:15

The magic search phrase you are looking for might be "pretty print".
