Basic string diff for C++ test cases
I have a C++ function which returns a multi-line std::string. In the test case for this, I compare each line against the known value, something like:
std::string known = "good\netc";
std::string output = "bad\netc";
std::vector<std::string> knownvec;
pystring::splitlines(known, knownvec); // splits on \n
std::vector<std::string> outvec;
pystring::splitlines(output, outvec);
// the line counts should match
CHECK_EQUAL(outvec.size(), knownvec.size());
// compare line by line, ignoring leading/trailing whitespace
for(size_t i = 0; i < std::min(outvec.size(), knownvec.size()); ++i)
    CHECK_EQUAL(pystring::strip(outvec[i]), pystring::strip(knownvec[i]));
This works, but if a single new-line is added, all subsequent CHECK_EQUAL assertions fail, which makes the output hard to read.
Is there a better way to compare the two strings, ideally in a nice, self-contained way (i.e. not linking against giantdifflib, or writing the strings to a file and calling the diff command)?
[Edit] I'm using OpenImageIO's rather simple unittest.h
The data being compared is mainly either YAML, or colour lookup tables. Here's an example test case - basically a few lines of headers, then lots of numbers:
Version 1
Format any
Type ...
LUT:
Pre {
0.0
0.1
...
1.0
}
3D {
0.0
0.1
...
1.0
}
The easiest thing to do would be to break out of your loop when strings no longer match:
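A minimal sketch of that early exit, written against the standard library only so it stands alone. The names split_lines and first_mismatch are hypothetical helpers (split_lines stands in for pystring::splitlines), and unlike the question's loop this version does not strip whitespace before comparing:

```cpp
#include <algorithm>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for pystring::splitlines: split text on '\n'.
std::vector<std::string> split_lines(const std::string& text) {
    std::vector<std::string> lines;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) lines.push_back(line);
    return lines;
}

// Hypothetical helper: index of the first line at which the two sequences
// differ, or -1 if they are identical. If one sequence is a strict prefix
// of the other, the index of the first extra line is returned.
long first_mismatch(const std::vector<std::string>& a,
                    const std::vector<std::string>& b) {
    const std::size_t n = std::min(a.size(), b.size());
    for (std::size_t i = 0; i < n; ++i) {
        if (a[i] != b[i]) return static_cast<long>(i);  // stop at the first difference
    }
    if (a.size() != b.size()) return static_cast<long>(n);
    return -1;
}
```

In a test, the returned index can feed a single CHECK_EQUAL on the two offending lines (or on the two sizes when one output is a prefix of the other), so one inserted line produces one failure instead of a cascade.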
If CHECK_EQUAL returns a boolean value, then you can obviously simplify the above example a bit.
If you want your unit test framework to provide the same output as diff when comparing multi-line strings, then I'm afraid you're expecting too much out of your unit test framework. If you don't want to link to an external library, or execute diff from within your test program, then you'll have to program some kind of diff algorithm yourself. Check out this other question about information on diff algorithms and libraries.
If you find that implementing a diff algorithm yourself is not worth the trouble (it probably isn't), then check out the Google Diff-Match-Patch libraries.
Short:
For the purposes of unit testing, you just need to flag that they are different. Unit tests don't fix failing unit tests, programmers fix failing unit tests.
Long:
If your sequence sizes are possibly different, there isn't a simple, generic way to compare them. I think you'll need a giantdifflib to do it poorly, let alone adequately.
I think if you can't say that the ordinal is not an identity, then you are going to have to use search to add information.
Consider this degenerate case:
Whether or not you choose either one of these solutions is going to come down to scoring the results or some artifact of the implementation:
My opinion is that if you have to assign a score to a result, then it is unlikely that a unit test is applicable.
Comparing containers isn't very easy in general. If the result cannot be lexicographically sorted, I'm not sure that any computational result will be informative beyond telling you that it's different.
This is obviously a fun problem to think about, but it is probably out of scope for unit testing.
A basic diff algorithm is rather easy to implement, if not terribly efficient. This Wikipedia article is a good starting place.
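As one illustration of how small such an algorithm can be, here is a sketch of a line diff built on the classic longest-common-subsequence table. The function name line_diff and the "  " / "- " / "+ " prefixes are assumptions made for this example, not part of any library mentioned above. It uses O(n*m) time and memory, which matches the "not terribly efficient" caveat but is usually fine for test-sized inputs:

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns a diff-like listing of two line sequences.
// Each output line is prefixed with "  " (present in both),
// "- " (only in expected) or "+ " (only in actual).
std::vector<std::string> line_diff(const std::vector<std::string>& expected,
                                   const std::vector<std::string>& actual) {
    const std::size_t n = expected.size(), m = actual.size();
    // lcs[i][j] = length of the longest common subsequence of
    // expected[i..n) and actual[j..m), filled from the bottom-right corner.
    std::vector<std::vector<std::size_t>> lcs(n + 1, std::vector<std::size_t>(m + 1, 0));
    for (std::size_t i = n; i-- > 0;) {
        for (std::size_t j = m; j-- > 0;) {
            lcs[i][j] = (expected[i] == actual[j])
                            ? lcs[i + 1][j + 1] + 1
                            : std::max(lcs[i + 1][j], lcs[i][j + 1]);
        }
    }
    // Walk the table from the top-left, emitting one output line per step.
    std::vector<std::string> out;
    std::size_t i = 0, j = 0;
    while (i < n && j < m) {
        if (expected[i] == actual[j]) {
            out.push_back("  " + expected[i]); ++i; ++j;
        } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
            out.push_back("- " + expected[i]); ++i;  // line missing from actual
        } else {
            out.push_back("+ " + actual[j]); ++j;    // extra line in actual
        }
    }
    while (i < n) out.push_back("- " + expected[i++]);
    while (j < m) out.push_back("+ " + actual[j++]);
    return out;
}
```

A test could run a single equality check on the two strings and print this listing only on failure, so an inserted new-line shows up as one "+ " line rather than as a failure on every subsequent line.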