Line-wrap-ignoring diff / diff across multiple lines / reflow-ignoring diff

Posted 2024-08-28 05:01:01

Does anybody know of a diff-like tool that can show me the changes between two text files, but ignore changes in whitespace including newlines?

Here's an example:

Before:

the quick brown fox jumped over the lazy bear.  the quick brown fox
jumped over the lazy bear.  the quick brown fox jumped over the lazy
bear.  the quick brown fox jumped over the lazy bear.

After:

quick brown fox jumped over the lazy bear.  the quick brown fox jumped
over the lazy bear.  the quick brown fox jumped over the lazy bear.
the quick brown fox jumped over the lazy bear.

All I did was delete one word and reflow it, but "diff -b" detects a change on every line (as it should; I'm not saying this is a bug in diff). But for large LaTeX files this is a major problem; change one word in a long paragraph and the diff you get back is basically useless.

By the way, I'm aware that this requires way more computational power than the usual lines-are-atomic diff. I'm only doing this on small human-generated files and am happy to wait a long time if I have to.

Comments (2)

橘亓 2024-09-04 05:01:01

wdiff does word-by-word alignment.

For an easy-to-read display in a terminal, run

 wdiff -al <file1> <file2> | less

This will show (at least on my system) insertions (words only in <file2>) in boldface and deletions (words only in <file1>) underlined.
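
To make "word-by-word alignment" concrete, here is a minimal Python sketch of the idea. It is not how wdiff is implemented; it assumes Python's standard `difflib` module, and the function name `word_diff` and the `[-...-]` / `{+...+}` markers are illustrative choices. Both texts are split on any run of whitespace (newlines included), so reflowing alone produces no differences.

```python
import difflib


def word_diff(old_text, new_text):
    """Return an inline word-level diff that ignores all whitespace changes."""
    # str.split() with no arguments splits on any whitespace run,
    # so line breaks are treated exactly like spaces.
    old_words = old_text.split()
    new_words = new_text.split()

    # autojunk=False stops difflib from treating very frequent words
    # as junk in long inputs.
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)

    pieces = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            pieces.append(" ".join(old_words[i1:i2]))
        if tag in ("delete", "replace"):
            pieces.append("[-" + " ".join(old_words[i1:i2]) + "-]")
        if tag in ("insert", "replace"):
            pieces.append("{+" + " ".join(new_words[j1:j2]) + "+}")
    return " ".join(pieces)


if __name__ == "__main__":
    old = ("the quick brown fox jumped over the lazy bear.  the quick brown fox\n"
           "jumped over the lazy bear.")
    new = ("quick brown fox jumped over the lazy bear.  the quick\n"
           "brown fox jumped over the lazy bear.")
    print(word_diff(old, new))
```

On the sample above, only the deleted leading word is marked; the changed line breaks do not show up at all.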

七月上 2024-09-04 05:01:01

One option is to split each file into words, one per line, and diff the results. You lose some context compared with a normal diff, but the output is very fine-tuned to the type of change you care about.

Example:

perl -pe 's/\s+/\n/g' < file1 > file1.split_words
perl -pe 's/\s+/\n/g' < file2 > file2.split_words
diff file1.split_words file2.split_words

You can do even better if the text has a special property: specifically, if reflow only happens within a paragraph, where paragraphs are separated by two newlines in a row. In that case, simply replace every single newline with a space and run a regular diff -w on the results.
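
The paragraph-based variant can be sketched in Python as follows. This is a rough illustration rather than the exact command pipeline: it assumes paragraphs are separated by at least one blank line, the helper names `unwrap_paragraphs` and `paragraph_diff` are made up for this example, and it collapses whitespace itself instead of relying on diff -w.

```python
import difflib
import re


def unwrap_paragraphs(text):
    """Return one string per paragraph, with every whitespace run collapsed to a space."""
    # A paragraph break is a newline, optional whitespace, then another newline.
    paragraphs = re.split(r"\n\s*\n", text.strip())
    return [" ".join(p.split()) for p in paragraphs]


def paragraph_diff(old_text, new_text):
    """Line-oriented unified diff where each 'line' is a whole unwrapped paragraph."""
    return list(difflib.unified_diff(
        unwrap_paragraphs(old_text),
        unwrap_paragraphs(new_text),
        fromfile="old",
        tofile="new",
        lineterm="",
    ))


if __name__ == "__main__":
    old = "one two\nthree\n\nsecond para here\n"
    new = "one\ntwo three\n\nsecond para there\n"
    for line in paragraph_diff(old, new):
        print(line)
```

A reflowed but otherwise unchanged paragraph compares equal, so only paragraphs whose words actually changed appear in the diff. For long paragraphs the changed line is still large, so a word-level diff may be easier to read.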
