Convert a diff to Markdown with strikethrough?

Posted 2024-08-24 03:10:27 · 418 characters · 10 views · 0 comments

I'd like to convert the output of diff (on a Markdown file) to
Markdown with <strike> and <em> tags, so that I can see what has
been removed from or added to a new version of a document. (This kind of
treatment is very common for legal documents.)

Example of hoped-for output:

<strike>Why do we</strike><em>We</em> study programming languages? <strike>not</strike><em>Not</em> in order to ...

One of the many
difficulties is that diff's output is line-oriented, where I want to
see differences in individual words. Does anyone have suggestions as
to what algorithm to use, or what software to build on?

Comments (3)

波浪屿的海角声 2024-08-31 03:10:27

Use wdiff. It already does the word-by-word comparison you're looking for; converting its output to Markdown should take just a few simple regular expressions.

For example:

$ cat foo
Why do we study programming languages?  Not in order to
$ cat bar
We study programming languages not in order to
$ wdiff foo bar
[-Why do we-]{+We+} study programming [-languages?  Not-] {+languages not+} in order to
$ wdiff foo bar | sed 's|\[-|<strike>|g;s|-]|</strike>|g;s|{+|<em>|g;s|+}|</em>|g'
<strike>Why do we</strike><em>We</em> study programming <strike>languages?  Not</strike> <em>languages not</em> in order to

Edit: Actually, wdiff has some options that make it even easier:

$ wdiff -w '<strike>' -x '</strike>' -y '<em>' -z '</em>' foo bar
<strike>Why do we</strike><em>We</em> study programming <strike>languages?  Not</strike> <em>languages not</em> in order to
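If wdiff is not available, the same word-by-word idea can be sketched with Python's standard `difflib` module. This is only an illustration, not how wdiff works internally: the function name `word_diff` is hypothetical, words are split on whitespace (so the original spacing and line breaks are not preserved), and the tags follow the question's convention of `<strike>` for removed text and `<em>` for added text.

```python
import difflib


def word_diff(old: str, new: str) -> str:
    """Annotate a word-level diff: <strike> marks removed words, <em> added words."""
    # Compare sequences of words instead of lines.
    old_words = old.split()
    new_words = new.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)

    pieces = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            pieces.append(" ".join(new_words[j1:j2]))
            continue
        # "delete" and "replace" both remove words from the old text.
        if i2 > i1:
            pieces.append("<strike>" + " ".join(old_words[i1:i2]) + "</strike>")
        # "insert" and "replace" both add words from the new text.
        if j2 > j1:
            pieces.append("<em>" + " ".join(new_words[j1:j2]) + "</em>")
    return " ".join(pieces)


if __name__ == "__main__":
    print(word_diff(
        "Why do we study programming languages?  Not in order to",
        "We study programming languages not in order to",
    ))
```

For the foo and bar contents shown above, this prints `<strike>Why do we</strike> <em>We</em> study programming <strike>languages? Not</strike> <em>languages not</em> in order to`. A real tool would also need to keep paragraph breaks and avoid splitting Markdown syntax, which this sketch ignores.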
放飞的风筝 2024-08-31 03:10:27

Use Markdown-Diff to annotate your original document with the word diff. It formats the output of wdiff or git diff --word-diff as Markdown, so you can review the changes in your favorite Markdown previewer or compiler. (I wrote Markdown-Diff myself, inspired by Adam Rosenfield's answer.)
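As a rough illustration of this kind of post-processing (not Markdown-Diff's actual implementation, which is not shown here), the `[-...-]` and `{+...+}` markers that wdiff emits by default can be rewritten with two regular expressions. The function name `wdiff_to_markup` and its parameters are hypothetical, and the sketch assumes the document text does not itself contain those marker sequences.

```python
import re

# Default wdiff markers: [-deleted-] and {+inserted+}.
DELETED = re.compile(r"\[-(.*?)-\]", re.DOTALL)
INSERTED = re.compile(r"\{\+(.*?)\+\}", re.DOTALL)


def wdiff_to_markup(text, deleted=("<strike>", "</strike>"), inserted=("<em>", "</em>")):
    """Replace wdiff-style markers with the given (open, close) markup pairs."""
    # Lambdas are used so the markup strings are never parsed as regex templates.
    text = DELETED.sub(lambda m: deleted[0] + m.group(1) + deleted[1], text)
    text = INSERTED.sub(lambda m: inserted[0] + m.group(1) + inserted[1], text)
    return text


if __name__ == "__main__":
    sample = "[-Why do we-]{+We+} study programming [-languages?  Not-] {+languages not+} in order to"
    print(wdiff_to_markup(sample))
    # Strikethrough and bold delimiters for renderers that support "~~".
    print(wdiff_to_markup(sample, deleted=("~~", "~~"), inserted=("**", "**")))
```

The default arguments produce the HTML tags requested in the question; passing other pairs switches to whatever delimiters your Markdown renderer understands.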

心碎无痕… 2024-08-31 03:10:27

You didn't specify the target platform, but if you are using .NET, you should definitely check out this article on CodeProject:
http://www.codeproject.com/KB/recipes/diffengine.aspx

The diff engine performs the comparison and returns a logical object to which you can apply your own visual display formatting. I have used it in several projects, one of which was a web-based text comparison where we were able to introduce all the markup you describe above. I have also extended the engine with new classes to do custom line-type comparisons.
