差异算法 C++
我正在尝试用 C++ 创建一个能够比较两个 .txt 文件的程序。
struct line
{
string text;
size_t num;
int status;
};
void compareFiles(vector<line> &buffer_1, vector<line> &buffer_2, size_t index_1, size_t index_2)
{
while(index_1 < buffer_1.size())
{
while(index_2 < buffer_2.size())
{
X = buffer_1[index_1].text;
Y = buffer_2[index_2].text;
if(X == Y)
{
++index_1;
++index_2;
}
else
{
LCS();
string lcs = printLCS(X.length(), Y.length());
/*
* Here's my problem
*/
}
}
}
}
如您所见,我有两个缓冲区(行向量),之前加载了文件内容。我还有功能齐全的 LCS 算法(经过测试)。 LCS 适用于全局定义的字符串 X 和 Y。
所以,我真正需要做的是将缓冲区与 LCS 进行逐行比较,但我没有找到一种方法来做到这一点。
请你帮助我好吗?
I'm trying to create a program in C++ that could be able to diff two .txt files.
struct line
{
string text;
size_t num;
int status;
};
void compareFiles(vector<line> &buffer_1, vector<line> &buffer_2, size_t index_1, size_t index_2)
{
while(index_1 < buffer_1.size())
{
while(index_2 < buffer_2.size())
{
X = buffer_1[index_1].text;
Y = buffer_2[index_2].text;
if(X == Y)
{
++index_1;
++index_2;
}
else
{
LCS();
string lcs = printLCS(X.length(), Y.length());
/*
* Here's my problem
*/
}
}
}
}
As you can see, I have two buffers (vectors of lines) previously loaded with files contents . I also have the LCS algorithm fully functional (tested). LCS works on strings X and Y globally defined.
So, what I really need to do is compare buffers line by line with LCS, but I didn't manage a way to do it.
Could you please help me?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
当有疑问时,我通常会听取以前做过这件事的人的意见。古老的 diff 程序一直存在,并且可以做您想做的事情。此外,它是开源的,因此请访问 ftp://mirrors。 kernel.org/gnu/diffutils/diffutils-3.0.tar.gz 并查看它。
解压存档后,打开 src/analyze.c。
diff_2_files
函数从第 472 行开始。执行实际比较的代码从第 512 - 537 行运行。它们复制如下:这里的想法是加载两个相同大小的缓冲区,然后加载每个文件进入缓冲区。使用
memcmp
一次比较两个文件的一个缓冲区,并查看是否有任何缓冲区与另一个不同。如果任何缓冲区比较未返回相等,则这两个文件不同。同样重要的是要注意,您永远不必一次读取超过两个缓冲区的数据,因此这种方法也适用于大文件。When in doubt, I usually defer to someone who's done it before. The venerable diff program has been around forever, and does what you want to do. Furthermore, it is open source, so head over to ftp://mirrors.kernel.org/gnu/diffutils/diffutils-3.0.tar.gz and check it out.
Once you unzip the archive, open up src/analyze.c. The
diff_2_files
function begins on line 472. The code that does the actual comparison runs from lines 512 - 537. They are reproduced below:The idea there is to load two buffers of identical size, then load each file into a buffer. Compare the two files one buffer at a time using
memcmp
, and see if any buffer differs from the other. If any buffer comparison does not return equal, then the two files are different. It's also important to note that you never have to read more than two buffers worth of data at a time, so this approach works on huge files too.首先,我会重写 LCS() 以两行作为参数并返回最长的公共序列 - 我想象一个像 std::string LCS(const line& lhs, const 行& rhs)。然后我将修改您的
while
循环,如下所示。这将为
buffer_1
和buffer_2
中的每个行组合找到并打印最长的公共序列。这是你想做的事吗?我正确理解你的问题吗?First of all, I would rewrite
LCS()
to take two lines as parameters and return the longest common sequence—I imagine a function signature likestd::string LCS(const line& lhs, const line& rhs)
. Then I would modify yourwhile
loops as follows.This will find and print the longest common sequence for each combination of lines in
buffer_1
andbuffer_2
. Is this want you want to do? Did I understand your question correctly?