Syntax highlighting: how does Eclipse do it so fast?

Posted on 2024-12-01 20:50:40

I've developed a syntax highlighter in Java for Android and it's working well, but the problem is it can be slow with big files.

So I'm wondering how source code editors like Eclipse and Gedit (Ubuntu) highlight what you've just written so quickly. For example, if you type the closing greater-than symbol when writing an HTML tag, the tag is highlighted instantly.

How is it so quick, even with big files? Is there a specific way they go about doing it or do they just perform the syntax highlighting for the line you're on?

Thanks,
Alex

Comments (1)

分开我的手 2024-12-08 20:50:40

I can't speak for Gedit, but in Eclipse, we cheat :-)

If you look very carefully, you can actually see that syntax coloring for structured languages like Java is a two-phase process.

First, a presentation reconciler is run to do very basic syntax coloring. It is triggered immediately on every change to the editor's document and is expected to be extremely fast. It is not really syntax-based coloring, but lexically-based coloring. So the focus is on tokens like strings, keywords, words, numbers, comments, etc.: all tokens that can be recognized easily based on simple character tables or similar. Thus there is no difference at this stage between a class name, a variable name or a static method name, even though they may be colored differently in the end.
For many languages, this is the only coloring done.
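
To make the "lexical only" idea concrete, here is a minimal sketch in Java. It is not Eclipse's code: the class `LexicalColorer`, its `Token` type and the tiny keyword table are all made up for illustration. It classifies text using nothing but character tests and a keyword lookup, so it cannot tell a class name from a variable name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of a purely lexical colorer; not Eclipse's actual code.
final class LexicalColorer {
    enum Kind { KEYWORD, WORD, NUMBER, STRING, COMMENT }

    static final class Token {
        final Kind kind;
        final int start; // inclusive offset
        final int end;   // exclusive offset
        Token(Kind kind, int start, int end) {
            this.kind = kind;
            this.start = start;
            this.end = end;
        }
    }

    // A tiny illustrative keyword table, deliberately incomplete.
    private static final Set<String> KEYWORDS = new HashSet<>(Arrays.asList(
            "class", "static", "int", "return", "if", "else", "new"));

    static List<Token> scan(String text) {
        List<Token> tokens = new ArrayList<>();
        int i = 0;
        int n = text.length();
        while (i < n) {
            char c = text.charAt(i);
            int start = i;
            if (c == '/' && i + 1 < n && text.charAt(i + 1) == '/') {
                // Line comment: runs to the end of the line.
                while (i < n && text.charAt(i) != '\n') i++;
                tokens.add(new Token(Kind.COMMENT, start, i));
            } else if (c == '"') {
                // String literal: runs to the closing quote or the end of the line.
                i++;
                while (i < n && text.charAt(i) != '"' && text.charAt(i) != '\n') {
                    if (text.charAt(i) == '\\' && i + 1 < n) i++; // skip the escaped character
                    i++;
                }
                if (i < n && text.charAt(i) == '"') i++;
                tokens.add(new Token(Kind.STRING, start, i));
            } else if (Character.isDigit(c)) {
                while (i < n && Character.isDigit(text.charAt(i))) i++;
                tokens.add(new Token(Kind.NUMBER, start, i));
            } else if (Character.isJavaIdentifierStart(c)) {
                while (i < n && Character.isJavaIdentifierPart(text.charAt(i))) i++;
                String word = text.substring(start, i);
                Kind kind = KEYWORDS.contains(word) ? Kind.KEYWORD : Kind.WORD;
                tokens.add(new Token(kind, start, i));
            } else {
                i++; // whitespace and punctuation keep the default color
            }
        }
        return tokens;
    }
}
```

Calling `LexicalColorer.scan(line)` on just the line being edited is cheap because the work is proportional to the length of that line, not to the size of the file.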

Next, a syntax reconciler is run to build an abstract syntax tree (AST) for the document, or as near to one as you can get in the face of syntax or semantic errors. This is triggered by a timer, and for some languages an attempt is made to do just a partial update of the AST (not easy). The completed AST is then used to update the outline view and to do additional syntax coloring based on the extra information, e.g. static method names. (The AST is often used for many other things as well, like hover information, folding, hyperlinking, etc.)
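
The "triggered by a timer" part can be sketched as a simple debounce: every edit restarts a delay, and the expensive pass runs only once typing has paused. This is an illustration under my own assumptions, not Eclipse's implementation; `DelayedReconciler` and its methods are hypothetical names, and the `reconcile` callback stands in for whatever builds the AST.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: run an expensive reconcile step only after typing pauses.
final class DelayedReconciler {
    private final ScheduledExecutorService timer =
            Executors.newSingleThreadScheduledExecutor();
    private final Runnable reconcile; // e.g. "rebuild the AST and recolor"
    private final long delayMillis;
    private ScheduledFuture<?> pending;

    DelayedReconciler(Runnable reconcile, long delayMillis) {
        this.reconcile = reconcile;
        this.delayMillis = delayMillis;
    }

    // Call on every document change; each call restarts the timer.
    synchronized void documentChanged() {
        if (pending != null) {
            pending.cancel(false); // drop the run that has not started yet
        }
        pending = timer.schedule(reconcile, delayMillis, TimeUnit.MILLISECONDS);
    }

    // Stop the background thread, e.g. when the editor is closed.
    void shutdown() {
        timer.shutdownNow();
    }
}
```

On Android the same idea is often written with a `Handler` and `postDelayed`, but the principle is identical: the slow pass never runs per keystroke.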

For both the initial presentation reconciler and the later syntax-based reconciler, some rather elaborate logic determines just how big a region of the document must be parsed. For the presentation reconciler, the decision can be based on any existing coloring, whereas for the syntax-based coloring a separate damage/repair phase is run to determine the size of the region.
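
As a rough illustration of what "determining the region" means, the simplest possible rule is to widen an edit to the whole lines it touches and re-scan only those. The sketch below assumes exactly that rule; real editors use more elaborate logic, and `DamageRegion.damagedLines` is a hypothetical helper, not an Eclipse API.

```java
// Hypothetical sketch: widen an edit to the full lines it touches.
final class DamageRegion {
    // text is the document after the edit; editOffset/editLength describe
    // the inserted text (editLength is 0 for a pure deletion).
    // Returns {start, end} offsets, start inclusive and end exclusive.
    static int[] damagedLines(String text, int editOffset, int editLength) {
        // Start of the line containing the edit (0 if there is no earlier newline).
        int start = text.lastIndexOf('\n', editOffset - 1) + 1;
        // End of the line containing the end of the edit.
        int end = text.indexOf('\n', editOffset + editLength);
        if (end < 0) {
            end = text.length();
        }
        return new int[] { start, end };
    }
}
```

Only `text.substring(start, end)` then needs to be re-tokenized, which is why typing stays fast in a large file as long as the edit does not change how the following lines should be read.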

Some extreme examples that always complicate matters are when block comments are added or removed:

a = b /* c + 1 /* remember the offset! */;

If the first slash is removed or added, the presentation reconciler must process a larger area than you would naively expect...
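
One common way to handle this case, shown here as a sketch and not as what Eclipse actually does, is to cache for each line whether a block comment is still open at its end. After an edit you re-scan from the edited line and stop as soon as the new end-of-line state matches the cached one. The class and method names are hypothetical, and string literals and `//` comments are ignored for brevity.

```java
// Hypothetical sketch: find how far an edit's effect on /* ... */ state spreads.
final class CommentStateRepair {
    // Scans one line and returns whether a block comment is open at its end.
    static boolean scanLine(String line, boolean inComment) {
        int j = 0;
        while (j < line.length()) {
            if (!inComment && line.startsWith("/*", j)) {
                inComment = true;
                j += 2;
            } else if (inComment && line.startsWith("*/", j)) {
                inComment = false;
                j += 2;
            } else {
                j++;
            }
        }
        return inComment;
    }

    // Cache: open[i] is true if a block comment is still open at the end of line i.
    static boolean[] openAtLineEnd(String[] lines) {
        boolean[] open = new boolean[lines.length];
        boolean inComment = false;
        for (int i = 0; i < lines.length; i++) {
            inComment = scanLine(lines[i], inComment);
            open[i] = inComment;
        }
        return open;
    }

    // Assumes the edit changed only the content of editedLine (same line count).
    // Returns the index of the last line that must be recolored.
    static int lastLineToRepair(String[] newLines, boolean[] oldOpen, int editedLine) {
        boolean state = editedLine > 0 && oldOpen[editedLine - 1];
        for (int i = editedLine; i < newLines.length; i++) {
            state = scanLine(newLines[i], state);
            if (state == oldOpen[i]) {
                return i; // later lines start in the same state as before
            }
        }
        return newLines.length - 1;
    }
}
```

If the edit leaves the end-of-line state unchanged, only the edited line is repaired. If it removes the slash that opened a comment spanning several lines, the loop keeps going until the states agree again, which is exactly the "larger area" effect.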
