Delphi - Compare strings and mark the differences
What I need to do is compare two strings and mark the differences with beginning/ending marks for changes. Example:
this is string number one.
this string is string number two.
output would be
this [is|string is] string number [one|two].
I've been trying to figure this out for some time now. I found something I believed would help me do this, but I have been unable to make it work.
http://www.angusj.com/delphi/textdiff.html
I have it about 80% working here, but I've got no idea how to get it to do exactly what I want it to do. Any help would be appreciated.
uses
  Windows, Messages, SysUtils, Variants, Classes, Graphics, Controls, Forms,
  Dialogs, Diff, StdCtrls;

type
  TForm2 = class(TForm)
    Edit1: TEdit;
    Edit2: TEdit;
    Button1: TButton;
    Memo1: TMemo;
    Diff: TDiff;
    procedure Button1Click(Sender: TObject);
  private
    { Private declarations }
  public
    { Public declarations }
  end;

var
  Form2: TForm2;
  s1, s2: string;

implementation

{$R *.dfm}

procedure TForm2.Button1Click(Sender: TObject);
var
  i: Integer;
  lastKind: TChangeKind;

  // Appends c to s, opening a '[' when a changed run starts
  // and closing it with ']' when the run ends.
  procedure AddCharToStr(var s: string; c: char; kind, lastkind: TChangeKind);
  begin
    if (kind <> lastkind) and (lastkind = ckNone) and (kind <> ckNone) then s := s + '[';
    if (kind <> lastkind) and (lastkind <> ckNone) and (kind = ckNone) then s := s + ']';
    case kind of
      ckNone:   s := s + c;
      ckAdd:    s := s + c;
      ckDelete: s := s + c;
      ckModify: s := s + '|' + c;
    end;
  end;

begin
  Diff.Execute(pchar(edit1.text), pchar(edit2.text), length(edit1.text), length(edit2.text));

  // now, display the diffs ...
  lastKind := ckNone;
  s1 := ''; s2 := '';
  form2.caption := inttostr(diff.Count);
  for i := 0 to Diff.count - 1 do
  begin
    with Diff.Compares[i] do
    begin
      // show changes to first string (with spaces for adds to align with second string)
      if Kind = ckAdd then
        AddCharToStr(s1, ' ', Kind, lastKind)
      else
        AddCharToStr(s1, chr1, Kind, lastKind);

      // show changes to second string (with spaces for deletes to align with first string)
      if Kind = ckDelete then
        AddCharToStr(s2, ' ', Kind, lastKind)
      else
        AddCharToStr(s2, chr2, Kind, lastKind);

      lastKind := Kind;
    end;
  end;
  memo1.Lines.Add(s1);
  memo1.Lines.Add(s2);
end;

end.
I took the basicdemo1 from angusj.com and modified it to get this far.
Comments (2)
To solve the problem you describe, you'd essentially have to do something like what is done in biological sequence alignment of DNA or protein data. If you only have two strings (or one constant reference string), it can be approached with dynamic programming-based pairwise alignment algorithms such as the Needleman-Wunsch algorithm* and related algorithms. (Multiple sequence alignment gets much more complicated.)
[* Edit: link should be: http://en.wikipedia.org/wiki/Needleman–Wunsch_algorithm ]
Edit 2:
Since you seem to be interested in comparing at the level of words rather than characters, you could (1) split the input strings into arrays of strings, where each array element represents a word, and then (2) carry out the alignment at the level of these words. This makes the search space for the alignment smaller, so you'd expect it to be faster overall. I have adapted and 'Delphified' the pseudo-code example from the Wikipedia article accordingly.
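As an illustration of the word-level approach (this is not the answerer's original listing), here is a minimal sketch. It uses a longest-common-subsequence table, which is a simplified relative of Needleman-Wunsch scoring in which only exact word matches count. The names `SplitWords` and `MarkDiff` are made up for this example, and words are assumed to be separated by spaces. Note that the marking differs slightly from the example in the question: a pure insertion appears as `[|string]`, and punctuation stays attached to its word.

```pascal
program WordDiffSketch;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$IFNDEF FPC}{$APPTYPE CONSOLE}{$ENDIF}

uses
  SysUtils;

type
  TWords = array of string;

// Splits S on spaces; runs of spaces produce no empty words.
function SplitWords(const S: string): TWords;
var
  I, N, Start: Integer;
begin
  SetLength(Result, 0);
  N := 0;
  I := 1;
  while I <= Length(S) do
  begin
    while (I <= Length(S)) and (S[I] = ' ') do Inc(I);
    Start := I;
    while (I <= Length(S)) and (S[I] <> ' ') do Inc(I);
    if I > Start then
    begin
      SetLength(Result, N + 1);
      Result[N] := Copy(S, Start, I - Start);
      Inc(N);
    end;
  end;
end;

// Returns NewText's common words as-is and each changed run as [old|new].
function MarkDiff(const OldText, NewText: string): string;
var
  A, B: TWords;
  L: array of array of Integer; // L[I][J] = LCS length of A[I..] and B[J..]
  I, J: Integer;
  Del, Ins, Output: string;

  procedure Append(var Dest: string; const W: string);
  begin
    if Dest <> '' then Dest := Dest + ' ';
    Dest := Dest + W;
  end;

  // Emits the pending changed run, if any, as one [old|new] group.
  procedure Flush;
  begin
    if (Del <> '') or (Ins <> '') then
    begin
      Append(Output, '[' + Del + '|' + Ins + ']');
      Del := '';
      Ins := '';
    end;
  end;

begin
  A := SplitWords(OldText);
  B := SplitWords(NewText);
  Del := ''; Ins := ''; Output := '';

  // Fill the table from the end; SetLength zero-initialises the borders.
  SetLength(L, Length(A) + 1, Length(B) + 1);
  for I := Length(A) - 1 downto 0 do
    for J := Length(B) - 1 downto 0 do
      if A[I] = B[J] then
        L[I][J] := L[I + 1][J + 1] + 1
      else if L[I + 1][J] >= L[I][J + 1] then
        L[I][J] := L[I + 1][J]
      else
        L[I][J] := L[I][J + 1];

  // Walk the table from the start, collecting deleted and inserted words.
  I := 0; J := 0;
  while (I < Length(A)) and (J < Length(B)) do
  begin
    if A[I] = B[J] then
    begin
      Flush;
      Append(Output, A[I]);
      Inc(I); Inc(J);
    end
    else if L[I + 1][J] >= L[I][J + 1] then
    begin
      Append(Del, A[I]); Inc(I);
    end
    else
    begin
      Append(Ins, B[J]); Inc(J);
    end;
  end;
  while I < Length(A) do begin Append(Del, A[I]); Inc(I); end;
  while J < Length(B) do begin Append(Ins, B[J]); Inc(J); end;
  Flush;
  Result := Output;
end;

begin
  // prints: this [|string] is string number [one.|two.]
  WriteLn(MarkDiff('this is string number one.',
                   'this string is string number two.'));
end.
```

To get `[one|two].` as in the question, the splitter would also have to treat trailing punctuation as a separate token; that is left out here to keep the sketch short.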
There is an Object Pascal Diff Engine available which might be of assistance. You might want to break each "word" onto a separate line for the comparison, or modify the algorithm to compare on a word-by-word basis.
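Breaking each word onto its own line could look like the sketch below. It uses only `TStringList` and `StringReplace` from the standard units and assumes words are separated by single spaces. The helper name `WordsAsLines` is made up for this example, and the diff engine itself is not called, since its API is not shown here.

```pascal
program WordsToLines;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$IFNDEF FPC}{$APPTYPE CONSOLE}{$ENDIF}

uses
  SysUtils, Classes;

// Returns a new list holding one word of S per line.
// Assumes single spaces between words; the caller frees the list.
function WordsAsLines(const S: string): TStringList;
begin
  Result := TStringList.Create;
  Result.Text := StringReplace(S, ' ', sLineBreak, [rfReplaceAll]);
end;

var
  Lines: TStringList;
  I: Integer;
begin
  Lines := WordsAsLines('this is string number one.');
  try
    // Each word is now a separate line that a line-based diff can compare.
    for I := 0 to Lines.Count - 1 do
      WriteLn(Lines[I]);
  finally
    Lines.Free;
  end;
end.
```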