WPF RichTextBox TextChanged event - how to find the deleted or inserted text?

Posted on 2024-08-20 04:44:41

While creating a customized editor with RichTextBox, I've faced the problem of finding the deleted/inserted text using the information provided by the TextChanged event.

The instance of TextChangedEventArgs has some useful data, but I guess it does not cover all the needs. Suppose a scenario in which multiple paragraphs are inserted and, at the same time, the selected text (which itself spanned multiple paragraphs) is deleted.

With the instance of TextChangedEventArgs, you have a collection of text changes, and each change only gives you the number of removed or added symbols and their position.

The only solution I have in mind is to keep a copy of the document and apply the given list of changes to it. But since the instances of TextChange only give us the number of inserted/removed symbols (and not the symbols themselves), we need to put in some special symbol (for example, '?') to denote unknown symbols while we transform our original copy of the document.

After applying all changes to the original copy of the document, we can compare it with the RichTextBox's updated document and find the mappings between the unknown symbols and the real ones. And finally, we get what we want!

Has anybody tried this before? I need your suggestions on the whole strategy, and what you think about this approach.

Regards


Comments (1)

苍风燃霜 2024-08-27 04:44:41

It primarily depends on how you use the text changes. When the sequence includes both inserts and deletes, it is theoretically impossible to know the details of each insert, since some of the inserted symbols may have subsequently been deleted. Therefore you have to choose which result you really want:

  • For some purposes you must know the exact sequence of changes, even if some of the inserted symbols must be left as "?".
  • For other purposes you must know exactly how the new text differs from the old, but not the exact sequence in which the changes were made.

I will describe techniques to achieve each of these results. I have used both techniques in the past, so I know they are effective.

To get the exact sequence

This is more appropriate if you are implementing a history or undo log or searching for specific actions.

For these uses, the process you describe is probably best, with one possible change: Instead of "finding the mappings between the unknown symbols and the real ones", simply run the scan forward to find the text of each "Delete" then run it backward to find the text of each "Insert".

In other words:

  1. Start with the initial text and process the changes in order. For each insert, insert '?' symbols. For each delete, remove the specified number of symbols and record them as the text deleted.

  2. Start with the final text and process the changes in reverse order. For each delete, insert '?' symbols. For each insert, remove the specified number of symbols and record them as the text inserted.

When this is complete, all of your "Insert" and "Delete" change entries will have the associated text to the best of our knowledge, and any text that was inserted and immediately deleted will be '?' symbols.
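
The two passes above can be sketched as follows. This is a minimal illustration in Python on plain strings, not WPF code. The `Change` tuple is a hypothetical stand-in for a WPF TextChange (its Offset, RemovedLength and AddedLength values), and the sketch assumes that each change first removes symbols at its offset and then adds new ones at the same place. A real RichTextBox counts symbols in its document rather than characters in a string, so the string slicing here is an assumption made for brevity.

```python
from typing import List, NamedTuple, Sequence, Tuple

UNKNOWN = "?"  # placeholder for symbols whose content is not known


class Change(NamedTuple):
    """Hypothetical stand-in for one text change."""
    offset: int   # position in the document just before this change
    removed: int  # number of symbols removed at offset
    added: int    # number of symbols added at offset


def recover_texts(initial: str, final: str,
                  changes: Sequence[Change]) -> Tuple[List[str], List[str]]:
    """Return (deleted, inserted): the text removed and added by each change."""
    deleted = [""] * len(changes)
    inserted = [""] * len(changes)

    # Forward pass: start from the initial text and replay the changes.
    # Removed symbols are recorded; added symbols become placeholders.
    text = initial
    for i, c in enumerate(changes):
        end = c.offset + c.removed
        deleted[i] = text[c.offset:end]
        text = text[:c.offset] + UNKNOWN * c.added + text[end:]

    # Backward pass: start from the final text and undo the changes.
    # Added symbols are recorded; removed symbols become placeholders.
    text = final
    for i in range(len(changes) - 1, -1, -1):
        c = changes[i]
        end = c.offset + c.added
        inserted[i] = text[c.offset:end]
        text = text[:c.offset] + UNKNOWN * c.removed + text[end:]

    return deleted, inserted


# "XYZ" is inserted at offset 2, then "YZ" is deleted again at offset 3.
changes = [Change(offset=2, removed=0, added=3),
           Change(offset=3, removed=2, added=0)]
deleted, inserted = recover_texts("abcdef", "abXcdef", changes)
print(deleted)   # ['', '??']
print(inserted)  # ['X??', '']
```

In this example the "YZ" that was inserted and later deleted never appears in either the initial or the final text, so it can only be reported as "??". If the document itself may contain '?', a character that cannot occur in the text would be a safer placeholder.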

To get the difference

This is more appropriate for revision marking or version comparison.

For these uses, simply use the text change information to compute a set of integer ranges in which changes might be found, then use a standard diff algorithm to find the actual changes. This tends to be very efficient in processing incremental changes but still gives you the best updates.

This is particularly nice when you paste in a replacement paragraph that is almost identical to the original: the text change information alone will indicate that the whole paragraph is new, but using diff (i.e., this technique) will mark only those symbol runs that are actually different.
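
As a rough illustration of the pasted-paragraph case, the sketch below uses Python's standard `difflib.SequenceMatcher` as the diff algorithm. The choice of difflib and the helper name `changed_runs` are assumptions made for this example; any standard diff would serve the same purpose.

```python
import difflib
from typing import List, Tuple


def changed_runs(old: str, new: str) -> List[Tuple[str, str, str]]:
    """Return (operation, old_run, new_run) for every run that differs."""
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    return [(tag, old[i1:i2], new[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"]


# The whole paragraph is pasted over, so the text change information
# would report all of it as removed and all of it as added.
old_paragraph = "The quick brown fox"
new_paragraph = "The slow brown fox"
print(changed_runs(old_paragraph, new_paragraph))
```

Here the diff reports a single replaced run ("quick" becomes "slow") even though the whole paragraph was overwritten.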

The code for computing the change range is simple: Represent the change as four integers (oldstart, oldend, newstart, newend). Run through each change:

  1. If changestart is before newstart, reduce newstart to changestart and reduce oldstart by an equal amount.
  2. If changeend is after newend, increase newend to changeend and increase oldend by an equal amount.

Once this is done, extract range [oldstart, oldend] from the old document and the range [newstart, newend] from the new document, then use the standard diff algorithm to compare them.
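
The range computation can be sketched like this, again in Python on plain strings with a hypothetical `Change` tuple standing in for a WPF TextChange. Two details are assumptions of this sketch rather than something stated above: the ranges are half-open (start inclusive, end exclusive), and after the two steps newend is also shifted by the net length of each change so that it stays expressed in the coordinates of the current document.

```python
from typing import NamedTuple, Optional, Sequence, Tuple


class Change(NamedTuple):
    """Hypothetical stand-in for one text change."""
    offset: int   # position in the document just before this change
    removed: int  # number of symbols removed at offset
    added: int    # number of symbols added at offset


def changed_ranges(changes: Sequence[Change]
                   ) -> Optional[Tuple[int, int, int, int]]:
    """Return (oldstart, oldend, newstart, newend), or None if no changes."""
    if not changes:
        return None
    # Start with an empty range at the first change.
    old_start = old_end = new_start = new_end = changes[0].offset
    for c in changes:
        # Step 1: the change starts before the range, so grow it to the left.
        if c.offset < new_start:
            shift = new_start - c.offset
            new_start -= shift
            old_start -= shift
        # Step 2: the change ends after the range, so grow it to the right.
        change_end = c.offset + c.removed
        if change_end > new_end:
            shift = change_end - new_end
            new_end += shift
            old_end += shift
        # Bookkeeping (assumption): the change itself resizes the new range.
        new_end += c.added - c.removed
    return old_start, old_end, new_start, new_end


old_doc = "The quick brown fox"
new_doc = "A slow brown fox"
# "quick" -> "slow" at offset 4, then "The" -> "A" at offset 0.
edits = [Change(offset=4, removed=5, added=4),
         Change(offset=0, removed=3, added=1)]
old_start, old_end, new_start, new_end = changed_ranges(edits)
print(old_doc[old_start:old_end])  # The quick
print(new_doc[new_start:new_end])  # A slow
```

Everything outside the two extracted slices is identical in both documents, so only the slices need to be handed to the diff algorithm.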
