文本编辑 LinkedList 与 List of Lines

发布于 2024-08-16 11:23:55 字数 211 浏览 2 评论 0原文

我正在使用 C# 编写一个文本编辑器,我知道我只是在重新发明轮子,但这是一次学习经历,也是我想做的事情。

现在我有一个基本的文本文档,使用类似于间隙缓冲区的东西,但是我必须更新行缓冲区以在每次对缓冲区进行编辑时保存每行的开头。

我正在考虑创建另一个文本文档以使用行列表进行测试并编辑每一行。

现在我的问题是使用 LinkedList 与标准列表相比有什么好处?

I'm writing a text editor using C#, I know i'm just reinventing the wheel but it's a learning experience and something I want to do.

Right now I have a basic Text Document using something that resembles a Gap Buffer however I have to update my line buffer to hold the start of each line each time an edit is made to the buffer.

I am looking at creating another Text Document for testing using a list of lines and editing each line instead.

Now the question that I have is what would be the benefits of using a LinkedList vs a standard List?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

紫南 2024-08-23 11:23:55

行的链接列表将快速插入新行或删除行,并且快速从特定行向下移动(如果有双向链表,则向上移动)少量行,并且快速移动到文档的开头。要快速移至文档末尾,您还需要存储链接列表的末尾。不过,转到特定行号的速度相对较慢,因为您必须从头开始并遍历各行,尽管除非您的文档有很多行,否则这应该不是问题。

普通列表可以快速移动到特定行号,但添加或删除除末尾以外的任何行的速度很慢,因为每次插入或删除行时都需要复制整个缓冲区。

为了编辑大型文档,我更喜欢使用链接列表而不是基于数组的列表。在任何一种情况下,如果文档包含任何极长的行,您都可能会遇到问题,因为字符串是不可变的,并且更改每个字符会很慢。

A linked list of lines will be fast to insert new lines or delete lines, and fast to move down (and up if you have a doubly-linked list) from a specific line by a small number of lines, and fast to move to the start of the document. To move quickly to the end of the document you will also need to store the end of the linked list. It is relatively slow to go to a specific line number though, as you will have to start from the beginning and iterate over the lines, although this shouldn't be a problem unless your documents have very many lines.

An ordinary list is fast to move to a specific line number but slow to add or remove any line except at the end as the entire buffer will need to be copied each time a line is inserted or deleted.

I would prefer a linked list over an array based list for the purposes of editing large documents. In either case you may have problems if the document contains any extremely long lines as strings are immutable and changing each character will be slow.

牛↙奶布丁 2024-08-23 11:23:55

我使用的是每行一个字符串数组。数组通常比链表更新更快或等于更新,因为它们比链表具有更好的缓存局部性,并且对于单个一级缓存未命中,您可以在数组中移动几十个指针项。我想说,对于少于 10000 个项目的所有内容,只需使用指针数组。

如果您编辑的文本很小(例如手写的源代码文件),那么这是正确的选择。最先进的文本编辑所需的许多元信息可以很好地拆分为“行的开头”点。最重要的是语法突出显示。

还有错误行、断点、分析信息、代码覆盖标记,所有这些都在一行上工作。它是源代码文本的本机结构,在相同情况下也适用于文献文本(如果您编写文本处理器而不是源代码编辑器),但在这种情况下,您需要将段落作为一个单元。

如果我有时间重新设计我的编辑器,我将添加不同的缓冲区实现,因为在较大的文本上,所有行信息数据的开销(在 32 位系统上每行大约 80 字节)非常重要。那么间隙缓冲区模型会更好,如果根本没有行,例如以十六进制模式显示二进制文件时,它也会好得多。

最后,当您允许用户打开大文件时,需要第三个缓冲区模型。有趣的是,看到关于无限文件大小编辑的营销废话(免费开源在这里更糟糕),一旦你打开 400 MB 的日志文件,整个系统就会变得没有响应。您在这里需要一个缓冲区模型,它不会首先将整个文件加载到缓冲区中。

I'm using a one string per line array. Arrays are often faster or equal to update then linked list because they have a much better cache locality then linked list and for a single 1st level cache miss you can move already a few dozens pointer items in an array. I would say for everything less then 10000 items just use an array of pointers.

If your edited text are small (like hand written source code files) it is the way to go. A lot of the meta information you need for state of the art text editing can be split very well into "beginning of a line" points. Most importantly syntax highlighting.

Also error lines, breakpoints, profiling infos, code coverage markers, all of them work on a line. It's the native structure of source code text and in same cases also for literature text (if you write a text processor and not a source code editor) but in that case your need to take a paragraph as a unit.

If i ever find the time to redesign my editor i will add different buffer implementations because on larger texts the overhead of all the line info data (it's about 80 byte per line on a 32bit system) is significant. Then a gap buffer model is better, it is also much better if you don't have lines at all for example when displaying binary files in hex mode.

And finally a third buffer model is required when you allow a user to open large files. It's funny to see marketing bullshit (free open source is surprisingly worse here) about unlimited file size editing and once you open a 400 MB log file, the whole system becames unresponsive. You need a buffer model here which is not loading the whole file into the buffer first.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文