Checking for deep equality in JUnit tests
I am writing unit tests for objects that are cloned, serialized, and/or written to an XML file. In all three cases I would like to verify that the resulting object is the "same" as the original one. I have gone through several iterations of my approach and, having found fault with all of them, am wondering what other people do.
My first idea was to manually implement the equals method in all the classes and use assertEquals. I abandoned this approach after deciding that overriding equals to perform a deep compare on mutable objects is a bad thing, as you almost always want collections to use reference equality for the mutable objects they contain[1].
Then I figured I could just rename the method to contentEquals or something. However, after thinking more, I realized this wouldn't help me find the sort of regressions I was looking for. If a programmer adds a new (mutable) field, and forgets to add it to the clone method, then he will probably forget to add it to the contentEquals method too, and all these regression tests I'm writing will be worthless.
I then wrote a nifty assertContentEquals function that uses reflection to check the value of all the (non-transient) members of an object, recursively if necessary. This avoids the problems with the manual compare method above, since it assumes by default that all fields must be preserved and the programmer must explicitly declare which fields to skip. However, there are legitimate cases when a field really shouldn't be the same after cloning[2]. I put in an extra parameter to assertContentEquals that lists which fields to ignore, but since this list is declared in the unit test, it gets real ugly real fast in the case of recursive checking.
So I am now thinking of moving back to including a contentEquals method in each class being tested, but this time implemented using a helper function similar to the assertContentEquals described above. This way, when operating recursively, the exemptions will be defined in each individual class.
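For readers who want something concrete, a reflective comparison with an ignore list might look roughly like the sketch below. It is not the asker's actual code: `ContentAssert`, `GraphNode`, and `Position` are hypothetical names, and the sketch assumes that enums, arrays, and any `java.*` value can be compared with `Objects.deepEquals` while everything else is compared field by field. It does not handle cyclic object graphs or user objects stored inside collections.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Objects;
import java.util.Set;

// Hypothetical helper: compares two objects field by field via reflection.
final class ContentAssert {
    private ContentAssert() {}

    /** Fails if any non-static, non-transient field differs, skipping names in ignoredFields. */
    static void assertContentEquals(Object expected, Object actual, Set<String> ignoredFields) {
        compare(expected, actual, ignoredFields, "<root>");
    }

    private static void compare(Object expected, Object actual, Set<String> ignored, String path) {
        if (expected == actual) {
            return;
        }
        if (expected == null || actual == null) {
            throw new AssertionError(path + ": expected " + expected + " but was " + actual);
        }
        if (expected.getClass() != actual.getClass()) {
            throw new AssertionError(path + ": class mismatch");
        }
        if (isLeaf(expected)) {
            // Assumption: JDK types, enums, and arrays are compared by value here.
            if (!Objects.deepEquals(expected, actual)) {
                throw new AssertionError(path + ": expected " + expected + " but was " + actual);
            }
            return;
        }
        // Walk the class hierarchy so inherited fields are checked too.
        for (Class<?> c = expected.getClass(); c != Object.class; c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                int mods = f.getModifiers();
                if (Modifier.isStatic(mods) || Modifier.isTransient(mods)
                        || f.isSynthetic() || ignored.contains(f.getName())) {
                    continue;
                }
                f.setAccessible(true);
                try {
                    compare(f.get(expected), f.get(actual), ignored, path + "." + f.getName());
                } catch (IllegalAccessException e) {
                    throw new AssertionError("cannot read field " + f.getName(), e);
                }
            }
        }
    }

    private static boolean isLeaf(Object value) {
        Class<?> type = value.getClass();
        return value instanceof Enum || type.isArray() || type.getName().startsWith("java.");
    }
}

// Hypothetical classes used only to exercise the helper.
final class Position {
    final int x;
    final int y;
    Position(int x, int y) { this.x = x; this.y = y; }
}

final class GraphNode {
    final String id;
    final String label;
    final Position position;
    GraphNode(String id, String label, Position position) {
        this.id = id;
        this.label = label;
        this.position = position;
    }
}
```

Note that the ignore list here is matched by bare field name at every level of the recursion, which is exactly why it becomes awkward once nested classes need different exemptions.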
Any comments? How have you approached this issue in the past?
Edited to expound on my thoughts:
[1] I got the rationale for not overriding equals on mutable classes from this article. Once you stick a mutable object in a Set/Map, if a field changes then its hash will change but its bucket will not, breaking things. So the options are to not override equals/hashCode on mutable objects, or to have a policy of never changing a mutable object once it has been put into a collection.
I didn't mention that I am implementing these regression tests on an existing codebase. In this context, the idea of changing the definition of equals, and then having to find every place where that could change the behavior of the software, frightens me. I feel like I could easily break more than I fix.
[2] One example in our codebase is a graph structure, where each node needs a unique identifier that is used to link the nodes when they are eventually written to XML. When we clone these objects we want the identifier to be different, but everything else to remain the same. After ruminating about it more, it seems like the questions "is this object already in this collection" and "are these objects defined the same" use fundamentally different concepts of equality in this context. The first is asking about identity, and I would want the ID included if doing a deep compare, while the second is asking about similarity, and I don't want the ID included. This is making me lean more against implementing the equals method.
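The bucket problem can be reproduced in a few lines. The sketch below uses a hypothetical `MutableKey` class whose equals/hashCode depend on a mutable field; after the field changes, the HashSet looks in the bucket for the new hash and no longer finds the element it still holds.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical class: equality and hash both depend on a mutable field.
final class MutableKey {
    int value;

    MutableKey(int value) { this.value = value; }

    @Override
    public boolean equals(Object o) {
        return o instanceof MutableKey && ((MutableKey) o).value == value;
    }

    @Override
    public int hashCode() {
        return Integer.hashCode(value);
    }
}

final class HashBucketDemo {
    private HashBucketDemo() {}

    /** Returns whether the set still reports containing the key after the key is mutated. */
    static boolean containsAfterMutation() {
        Set<MutableKey> set = new HashSet<>();
        MutableKey key = new MutableKey(1);
        set.add(key);      // stored in the bucket for hash 1
        key.value = 2;     // hash is now 2, but the entry stays in the old bucket
        return set.contains(key);
    }

    public static void main(String[] args) {
        System.out.println(containsAfterMutation()); // prints "false"
    }
}
```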
Do you guys agree with this decision, or do you think that implementing equals is the better way to go?
Comments (2)
I would go with the reflection approach and define a custom Annotation with RetentionPolicy.RUNTIME to allow the implementers of the tested classes to mark the fields that are expected to change after cloning. You can then check the annotation with reflection and skip the marked fields.
This way you can keep your test code generic and simple and have a convenient means to mark exceptions directly in the code without affecting the design or runtime behavior of the code that needs to be tested.
The annotation could look like this:
This is how it can be used in the code that is to be tested:
And finally the relevant part of the test code:
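Taken together, the three pieces described above might look roughly like the following. This is a minimal sketch under assumptions, not the answerer's original code: the names `ChangesOnClone`, `XmlNode`, and `CloneAssert.assertFieldsPreserved` are hypothetical, and the check is deliberately flat (it compares each field with `Objects.deepEquals` rather than recursing into nested objects).

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Objects;

// Hypothetical marker annotation: the field is expected to differ after cloning.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface ChangesOnClone {}

// Hypothetical class under test: every instance, including clones, gets a fresh id.
final class XmlNode implements Cloneable {
    private static int nextId = 0;

    @ChangesOnClone
    private int id = nextId++;
    private final String label;

    XmlNode(String label) { this.label = label; }

    @Override
    public XmlNode clone() {
        try {
            XmlNode copy = (XmlNode) super.clone();
            copy.id = nextId++; // the clone must not share the original's identifier
            return copy;
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e);
        }
    }
}

// Hypothetical test helper: checks every field except those marked @ChangesOnClone.
final class CloneAssert {
    private CloneAssert() {}

    static void assertFieldsPreserved(Object original, Object copy) {
        if (original.getClass() != copy.getClass()) {
            throw new AssertionError("class mismatch");
        }
        for (Class<?> c = original.getClass(); c != Object.class; c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                int mods = f.getModifiers();
                if (Modifier.isStatic(mods) || Modifier.isTransient(mods) || f.isSynthetic()
                        || f.isAnnotationPresent(ChangesOnClone.class)) {
                    continue; // skipped by default rules or by explicit annotation
                }
                f.setAccessible(true);
                try {
                    if (!Objects.deepEquals(f.get(original), f.get(copy))) {
                        throw new AssertionError("field " + f.getName() + " was not preserved");
                    }
                } catch (IllegalAccessException e) {
                    throw new AssertionError("cannot read field " + f.getName(), e);
                }
            }
        }
    }
}
```

Because the exemption lives on the field itself, a newly added field is checked by default, and the test code never needs a per-class ignore list.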
AssertJ offers a recursive comparison function:
See the AssertJ documentation for details: https://assertj.github.io/doc/#basic-usage
Prerequisites for using AssertJ:
import:
maven dependency: