Parsing theory and live syntax highlighting
I am trying to understand how to implement live syntax highlighting when processing a very large string. I'm quite confused. This is what I know (supposing I have the function parsedString parseString(rawString)):
(1) Call parseString(entireText) and replace the current string with the returned parsed (and styled, etc.) string on every text change. This seems a bad approach when handling large inputs.
(2) Someone suggested analyzing only the edited range, and replacing the current raw edited string with the parsed string parseString(editedRange).
Method (1) is clear enough. What I cannot understand is (2). When typing, the notification fires for each character added to the string, so only a single character gets parsed (and is returned as it is).
For example, if I want red selectors when parsing a .css file, how can I tell when there is a completed selector, followed by a {, that should be colored? I suppose there is some way to delay the parsing until there is a match. How do you implement this?
I'm not looking for a working application. A good explanation would be useful as well.
Thank you in advance.
Comments (1)
To re-parse incremental changes, you'll need a lower-level API to your parser.
A parser has a state that changes as it processes the input. For example, first the parser might be skipping spaces, now it might be reading a number, later it might be building an abstract syntax tree for an expression. If you could take a snapshot of all that parser state information at milestone points in the input, then you could reparse an incremental change by starting at the last milestone before the change (and possibly stopping earlier if the state is identical at a milestone beyond that change).
For simple syntax highlighting, as done by many programmer's editors, this is the approach. Syntax highlighting requires little more than tokenization, so there isn't much state to capture. Many programming languages also offer plenty of opportunities for milestones, e.g., at the beginning of a new line. In those cases, you might not even need to save the parser state, as you might know that it's always the same at the beginning of a line.
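To make the idea of line-start milestones concrete, here is a minimal sketch in Python of a line tokenizer for CSS-like text. It is an illustration under stated assumptions, not a real CSS lexer: the names `tokenize_line`, `TOP`, and `BLOCK` are hypothetical, and comments, strings, and nested at-rules are ignored. The only state carried from one line to the next is whether the lexer is inside a `{ ... }` block.

```python
from typing import List, Tuple

Token = Tuple[str, str]        # (kind, text)
TOP, BLOCK = "top", "block"    # hypothetical lexer states, assumed for this sketch


def tokenize_line(line: str, state: str) -> Tuple[List[Token], str]:
    """Tokenize one line starting in `state`; return (tokens, state at end of line)."""
    tokens: List[Token] = []
    buf = ""
    for ch in line:
        opens = ch == "{" and state == TOP
        closes = ch == "}" and state == BLOCK
        if opens or closes:
            # Text collected so far belongs to the state we are about to leave.
            if buf.strip():
                tokens.append(("selector" if state == TOP else "declaration", buf))
            buf = ""
            tokens.append(("punct", ch))
            state = BLOCK if opens else TOP
        else:
            buf += ch
    # Leftover text at end of line is classified by the current state.
    if buf.strip():
        tokens.append(("selector" if state == TOP else "declaration", buf))
    return tokens, state
```

With a tokenizer of this shape, the same text is classified differently depending on the state at the start of the line, so the highlighter does not have to wait for the `{` to appear: text typed at top level can be treated as selector text right away, and the saved state is what makes that decision possible.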
So you need an API like:
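One possible shape for such an API is sketched below in Python. Every name here (`IncrementalHighlighter`, `set_text`, `replace_line`, `toy_tokenize_line`) is a hypothetical chosen for illustration and is not taken from the original answer. The sketch assumes one milestone per line, handles only edits that stay within a single line, and uses a deliberately crude toy tokenizer so that the block is self-contained.

```python
from typing import Callable, List, Optional, Tuple

Token = Tuple[str, str]                                   # (kind, text)
LineTokenizer = Callable[[str, str], Tuple[List[Token], str]]


def toy_tokenize_line(line: str, state: str) -> Tuple[List[Token], str]:
    """Toy stand-in tokenizer: classifies a whole line by its start state."""
    kind = "selector" if state == "top" else "declaration"
    tokens = [(kind, line)] if line.strip() else []
    if "{" in line:
        state = "block"
    if "}" in line:
        state = "top"
    return tokens, state


class IncrementalHighlighter:
    """Caches the lexer state at the start of every line (the milestones)."""

    def __init__(self, tokenize_line: LineTokenizer, initial_state: str) -> None:
        self.tokenize_line = tokenize_line
        self.initial_state = initial_state
        self.lines: List[str] = []
        self.start_states: List[Optional[str]] = []
        self.tokens: List[List[Token]] = []

    def set_text(self, text: str) -> None:
        """Full parse: done once, or whenever the whole text is replaced."""
        self.lines = text.split("\n")
        self.start_states = [None] * len(self.lines)
        self.tokens = [[] for _ in self.lines]
        self.start_states[0] = self.initial_state
        self._retokenize_from(0)

    def replace_line(self, index: int, new_line: str) -> int:
        """Apply a single-line edit; return how many lines were re-tokenized."""
        self.lines[index] = new_line
        return self._retokenize_from(index)

    def _retokenize_from(self, index: int) -> int:
        state = self.start_states[index]
        count = 0
        i = index
        while i < len(self.lines):
            self.tokens[i], state = self.tokenize_line(self.lines[i], state)
            count += 1
            i += 1
            if i < len(self.lines):
                if self.start_states[i] == state:
                    break              # next milestone unchanged: stop early
                self.start_states[i] = state
        return count
```

A short usage example:

```python
h = IncrementalHighlighter(toy_tokenize_line, "top")
h.set_text("a {\n  color: red;\n}\nb {\n  margin: 0;\n}")
h.replace_line(1, "  color: blue;")   # re-tokenizes only line 1
h.replace_line(0, "a")                # removing "{" forces lines 0-2 to be redone
```

The early exit is the part that keeps typing cheap: most keystrokes leave the end-of-line state unchanged, so only the edited line is re-tokenized. Line insertions and deletions would need the cached lists to be shifted as well, which this sketch leaves out.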