Editing a text buffer

Posted on 2024-07-06 02:18:21


Comments (11)

身边 2024-07-13 02:18:21

You can use almost any data structure to write a text editor. Two million characters is a fairly thick novel's worth of typing, and you can easily shift that many up or down (for an insert or delete in a simple array) in less than a tenth of a second. Don't listen to anyone who tells you not to build one; you'll get something that works exactly right in all the small details.

I wrote mine after I'd done too much web browsing and had got used to page up/down behaving the same as clicking above/below the scrollbar thumb. In a normal editor, typing a character jumps the view back to where you were before you started navigating with the scrollbar. That got too annoying for me, so I wrote my own.

If I were going to do a rewrite (the current version just uses Delphi AnsiStrings for each text buffer, with newline characters embedded), I'd use an integer or int64 for each character and encode block start/stop, cursor position and line markers in the high bits. That way you don't have to adjust pointers when you insert or delete things.
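A minimal sketch of the high-bit marker idea, in Python rather than Delphi for brevity. The flag names (`CURSOR`, `BLOCK_START`, `BLOCK_END`) and the 21-bit character mask are assumptions made for illustration, not details from the answer.

```python
# Each cell is one integer: the low 21 bits hold the code point
# (Unicode needs at most 21 bits), the higher bits hold markers.
# Flag names and bit positions are illustrative assumptions.
CHAR_MASK = (1 << 21) - 1
CURSOR = 1 << 21
BLOCK_START = 1 << 22
BLOCK_END = 1 << 23


def encode(text):
    """Turn a string into a list of marker-free cells."""
    return [ord(ch) for ch in text]


def decode(cells):
    """Strip the marker bits and rebuild the string."""
    return "".join(chr(cell & CHAR_MASK) for cell in cells)


def set_marker(cells, index, flag):
    """Attach a marker to the character at index."""
    cells[index] |= flag


def find_marker(cells, flag):
    """Return the index of the first cell carrying flag, or -1."""
    for i, cell in enumerate(cells):
        if cell & flag:
            return i
    return -1


cells = encode("hello world")
set_marker(cells, 6, CURSOR)      # cursor sits on the "w"
cells[0:0] = encode(">> ")        # insert text before the cursor
cursor_at = find_marker(cells, CURSOR)
```

Because the marker travels with its character, the insert needs no separate cursor adjustment. The costs are a linear scan to find a marker again and the need to decide what happens when the marked character itself is deleted.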

狼性发作 2024-07-13 02:18:21

Your primary data structure is the one that contains the text. Rather than using one long buffer, you'll probably want an array of lines, because it's faster to insert a character into the middle of a line than it is to insert a character into the middle of a large buffer.

You'll need to decide whether your text editor should support embedded formatting. If, for example, you need fonts, bolding, underlining, etc., then your data structure will need ways of embedding formatting codes within your text. In the good old days of 8-bit characters we could use the upper 8 bits of a 16-bit integer to store any formatting flags and the lower 8 bits to store the character itself.

The actual code will depend on the language you're using. In C# or C++ you'll probably use an array of strings for the lines. In C you'll have an array of heap-based character arrays.

Separate out the display code from the text handling code as much as possible. The center of your code will be a tight loop something like:

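while (editing) {
    ch = GetCharacter();      /* wait for the next keystroke */
    ProcessCharacter(ch);     /* apply it to the text data structure */
    UpdateDisplay();          /* redraw from the current text */
}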

A more sophisticated editor will use separate threads for the character getting/processing and the display updating.
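A rough, single-threaded sketch of that loop with the text handling kept apart from the display code. It is written in Python and fed from a scripted key list so it can run anywhere. `TextModel`, `render`, `run` and the key names are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class TextModel:
    """Text handling only: knows nothing about the screen."""
    text: str = ""
    cursor: int = 0

    def process(self, key):
        """Apply one key; return False when editing should stop."""
        if key == "QUIT":
            return False
        if key == "LEFT":
            self.cursor = max(0, self.cursor - 1)
        elif key == "BACKSPACE":
            if self.cursor > 0:
                self.text = self.text[:self.cursor - 1] + self.text[self.cursor:]
                self.cursor -= 1
        else:  # treat anything else as literal text to insert
            self.text = self.text[:self.cursor] + key + self.text[self.cursor:]
            self.cursor += len(key)
        return True


def render(model):
    """Display code: reads the model and never modifies it."""
    return model.text[:model.cursor] + "|" + model.text[model.cursor:]


def run(keys):
    """The loop from the answer, driven by a scripted key sequence."""
    model = TextModel()
    frames = []
    for key in keys:                   # GetCharacter
        if not model.process(key):     # ProcessCharacter
            break
        frames.append(render(model))   # UpdateDisplay
    return model, frames


model, frames = run(["h", "i", "LEFT", "BACKSPACE", "QUIT", "x"])
```

Since `render` only reads the model, swapping it for a curses or GUI redraw would not touch the editing logic. That separation is also what makes a later move to separate threads feasible.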

不甘平庸 2024-07-13 02:18:21

This really depends on your design. A couple of years back, I wrote a small editor using curses. I used a doubly linked list where each node was a character (quite a wasteful design, but it makes the formatting and screen refresh routines really easy).

Other data structures used by my friends (this was a homework project) were:
1) A linked list of arrays, with each array representing a line.
2) A 2D linked list (I just made up that name): a linked list of characters in which each character is also linked to the character above and below it.
3) An array of linked lists.

However, I would suggest going through the source code of some simple editors like pico to see which data structures they use.

遥远的她 2024-07-13 02:18:21

Have you checked out Scintilla's source code?

抚笙 2024-07-13 02:18:21

Check out vim, it's open-source. Poke around in it to see how it handles what you want.

眼眸里的那抹悲凉 2024-07-13 02:18:21

I used to work for a company whose main product was a text editor. While I mainly worked on the scripting language for it, the internal design of the editor itself was naturally a major topic of discussion.

It seemed to break down into two general schools of thought. One was to store each line by itself and then link the lines together in a linked list or whatever other overall data structure you were happy with. The advantage was that any line-oriented editing action (such as deleting an entire line, or moving a block of lines within a file) was beyond trivial to implement and therefore lightning fast. The downside was that loading and saving the file took a bit more work, because you had to traverse the entire file and build these data structures.

The other school of thought at that time was to keep hunks of text together, regardless of line breaks, for as long as they hadn't been changed, breaking them up only as required by editing. The advantage was that an unedited hunk of the file could be blasted out to a file very easily, so simple edits where you load a file, change one line, and save the file were super fast. The disadvantage was that line-oriented or column-block operations were very time-consuming to execute, because you had to parse through these hunks of text and move a lot of data around.

We always stuck with the line-oriented design, for whatever that is worth, and our product was considered one of the fastest editors at the time.

故事还在继续 2024-07-13 02:18:21

The "Gang of Four" book (Design Patterns) has a GUI-based text editor as its main source of examples and is a worthwhile book to own.

The general "pure text" editor probably uses ropes, which SGI's STL has an implementation of. Basically, they are a linked structure of character buffers (a linked list in the simplest view; real implementations typically arrange the buffers in a tree). That way, inserting/deleting characters involves changing smaller buffers and a few pointers, instead of storing the entire document in a single buffer and having to shift everything.
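A toy rope in Python, to make "smaller buffers and a few pointers" concrete. This is a sketch under simplifying assumptions and not SGI's implementation: it uses the common tree form (leaf buffers joined by concatenation nodes), never rebalances, and the names `Leaf`, `Concat`, `split`, `insert`, `delete` and `to_str` are made up for the example.

```python
class Leaf:
    """A small immutable character buffer."""
    def __init__(self, text):
        self.text = text
        self.length = len(text)


class Concat:
    """An interior node: the concatenation of two sub-ropes."""
    def __init__(self, left, right):
        self.left = left
        self.right = right
        self.length = left.length + right.length


def split(node, i):
    """Return two ropes covering [0, i) and [i, length)."""
    if isinstance(node, Leaf):
        # Only the one leaf containing position i is copied.
        return Leaf(node.text[:i]), Leaf(node.text[i:])
    if i <= node.left.length:
        a, b = split(node.left, i)
        return a, Concat(b, node.right)
    a, b = split(node.right, i - node.left.length)
    return Concat(node.left, a), b


def insert(node, i, text):
    """Insert text at position i; untouched leaves are shared."""
    left, right = split(node, i)
    return Concat(Concat(left, Leaf(text)), right)


def delete(node, start, end):
    """Remove the characters in [start, end)."""
    left, rest = split(node, start)
    _, right = split(rest, end - start)
    return Concat(left, right)


def to_str(node):
    """Flatten the rope back into one string."""
    if isinstance(node, Leaf):
        return node.text
    return to_str(node.left) + to_str(node.right)


rope = Leaf("hello world")
rope = insert(rope, 5, ",")
with_comma = to_str(rope)
rope = delete(rope, 0, 7)
```

Each edit copies at most the leaf it lands in and allocates a handful of new nodes. A production rope would also rebalance the tree and merge tiny or empty leaves, which this sketch skips.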

山人契 2024-07-13 02:18:21

This is 2008. Don't write a text editor; you're reinventing fire.

Still here? I'm not sure if this applies or what platforms you plan to support, but the Neatpad series of tutorials is a great place to start thinking about writing a text editor. They focus on Win32 as the basic platform, but many of the lessons learned will apply anywhere.

月下伊人醉 2024-07-13 02:18:21

My favorite solution is the gap buffer, because it's pretty easy to implement and has good amortized efficiency. Just use a single array of characters, with a region designated as the gap. Once you understand the concept, the code follows almost naturally.

You also need an auxiliary array [vector<int>] to track the index of the beginning of each line, so that you can easily extract a particular line of text. The auxiliary array only needs to be updated when the gap moves, or when a newline is inserted/removed.
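A small gap buffer sketch in Python, assuming a list of characters as the backing array. The class and method names are illustrative. For brevity the line index is recomputed on demand instead of being maintained incrementally as the answer describes, so treat `line_starts` as a stand-in for the auxiliary array.

```python
class GapBuffer:
    """One array of characters with a movable gap of unused slots."""

    def __init__(self, text="", gap_size=16):
        self.buf = list(text) + [None] * gap_size
        self.gap_start = len(text)        # first unused slot
        self.gap_end = len(self.buf)      # one past the last unused slot

    def __len__(self):
        return len(self.buf) - (self.gap_end - self.gap_start)

    def move_gap(self, pos):
        """Slide the gap so it begins at logical position pos."""
        while self.gap_start > pos:       # gap moves left
            self.gap_start -= 1
            self.gap_end -= 1
            self.buf[self.gap_end] = self.buf[self.gap_start]
        while self.gap_start < pos:       # gap moves right
            self.buf[self.gap_start] = self.buf[self.gap_end]
            self.gap_start += 1
            self.gap_end += 1

    def _grow(self, extra):
        """Widen the gap by inserting extra empty slots at its end."""
        self.buf[self.gap_end:self.gap_end] = [None] * extra
        self.gap_end += extra

    def insert(self, pos, text):
        self.move_gap(pos)
        if self.gap_end - self.gap_start < len(text):
            self._grow(max(len(text), len(self.buf)))
        for ch in text:                   # fill the gap from the left
            self.buf[self.gap_start] = ch
            self.gap_start += 1

    def delete(self, pos, count):
        """Delete count characters after pos by widening the gap."""
        self.move_gap(pos)
        self.gap_end = min(len(self.buf), self.gap_end + count)

    def text(self):
        return "".join(self.buf[:self.gap_start] + self.buf[self.gap_end:])

    def line_starts(self):
        """Index of the first character of each line (simplified)."""
        return [0] + [i + 1 for i, ch in enumerate(self.text()) if ch == "\n"]

    def line(self, n):
        starts, whole = self.line_starts(), self.text()
        end = starts[n + 1] - 1 if n + 1 < len(starts) else len(whole)
        return whole[starts[n]:end]


g = GapBuffer("hello\nworld", gap_size=2)
g.insert(5, ", there")
after_insert = g.text()
second_line = g.line(1)
g.delete(0, 7)
```

Typing at one spot costs almost nothing because the gap is already there; only a jump to a distant position pays for moving the gap, which is where the amortized efficiency comes from.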

厌味 2024-07-13 02:18:21

These two online documents present a small but useful cornucopia of "well-known" data structures/techniques for text editors.

  1. "Data Structures for Text Sequences" describes and experimentally analyses a few data structures, leaning towards piece tables as the data structure of choice. Net.wisdom, however, seems to lean towards gap buffers as being more than adequate for text editing, and simpler to implement/debug.
  2. "The Craft of Text Editing" (www.finseth.com/craft/) is an older work that addresses more than just data structures and is oriented towards Emacs-style editors, but the concepts are generally useful.
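For readers who have not met a piece table, here is a minimal Python sketch. It is not taken from either document. The layout assumed here is one read-only original buffer, one append-only add buffer, and a list of `(source, start, length)` pieces; all names are illustrative.

```python
class PieceTable:
    """The text is described by pieces that point into two buffers."""

    def __init__(self, original):
        self.original = original          # never modified after loading
        self.added = ""                   # append-only: all inserted text
        self.pieces = [("orig", 0, len(original))] if original else []

    def _source(self, name):
        return self.original if name == "orig" else self.added

    def _split_at(self, pos):
        """Make pos a piece boundary; return the index of the piece there."""
        offset = 0
        for i, (src, start, length) in enumerate(self.pieces):
            if pos == offset:
                return i
            if pos < offset + length:
                head = pos - offset
                self.pieces[i:i + 1] = [
                    (src, start, head),
                    (src, start + head, length - head),
                ]
                return i + 1
            offset += length
        if pos == offset:
            return len(self.pieces)
        raise IndexError(pos)

    def insert(self, pos, text):
        if not text:
            return
        i = self._split_at(pos)
        self.pieces.insert(i, ("add", len(self.added), len(text)))
        self.added += text

    def delete(self, pos, count):
        i = self._split_at(pos)
        j = self._split_at(pos + count)
        del self.pieces[i:j]              # no characters are moved

    def text(self):
        return "".join(self._source(s)[a:a + n] for s, a, n in self.pieces)


pt = PieceTable("the quick fox")
pt.insert(10, "brown ")
after_insert = pt.text()
pt.delete(4, 6)
```

Edits only rearrange the small piece list and the original text is never touched, which makes undo and large files cheap. The extra bookkeeping is the price paid compared with a gap buffer.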
毅然前行 2024-07-13 02:18:21

A simple approach would be line-oriented: represent the file as an array/vector of char/wchar_t arrays/vectors, one per line. Insertions and deletions work the way you'd expect, although the end of a line is a special case.

I'd start with that and possibly replace the line data structure with something more efficiently supporting inserts/deletes on long lines after you have everything else working.
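A sketch of that starting point in Python, using a list of strings in place of a vector of character vectors. `LineBuffer` and its methods are hypothetical names. The two end-of-line special cases are shown explicitly: inserting a newline splits a line, and deleting at the end of a line joins it with the next.

```python
class LineBuffer:
    """The file as a list of lines, stored without their newlines."""

    def __init__(self, text=""):
        self.lines = text.split("\n")     # always at least one line

    def insert_char(self, row, col, ch):
        line = self.lines[row]
        if ch == "\n":
            # Special case: a newline splits the line in two.
            self.lines[row:row + 1] = [line[:col], line[col:]]
        else:
            self.lines[row] = line[:col] + ch + line[col:]

    def delete_char(self, row, col):
        """Delete the character at (row, col), like the Delete key."""
        line = self.lines[row]
        if col < len(line):
            self.lines[row] = line[:col] + line[col + 1:]
        elif row + 1 < len(self.lines):
            # Special case: at end of line, join with the next line.
            self.lines[row:row + 2] = [line + self.lines[row + 1]]

    def text(self):
        return "\n".join(self.lines)


b = LineBuffer("ab\ncd")
b.insert_char(0, 1, "X")      # ordinary insert inside a line
b.insert_char(0, 2, "\n")     # split the first line
after_split = list(b.lines)
b.delete_char(1, 1)           # at end of the second line: join with the third
```

Only the affected line is rebuilt on each edit, so the per-line container can later be swapped for something like a gap buffer without changing the rest of the editor.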
