Is there any way to save an object's hash code during binary serialization?
I want to be able to compare objects by their hash code.
For example, one is the object itself, and the other is a version of the same object that has been serialized (binary) and then restored.
How can I save the hash code in the serialized (binary) object?
Comments (3)
Why would you have to serialize the hash code? Instead, you should provide a proper implementation of GetHashCode() and Equals() in your object that lets you compare two objects based on their values: if two objects are equal, their hash codes have to match. So once you have deserialized the object, you can call GetHashCode() on it and compare the result with the other object's hash code. Note that matching hash codes are not enough to establish equality, because two different objects can share a hash code; you still have to call a proper implementation of Equals() to determine equality.
If you just want to compare specific fields within an object and a full comparison might be too expensive (e.g. a large binary array), it might make sense to generate an MD5 hash of the field (e.g. with MD5CryptoServiceProvider.ComputeHash()) and store it within the object itself. It will then be serialized just like any other object property.
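A minimal sketch of the two ideas in this answer, under stated assumptions: the class name Document and its members Title, Payload and PayloadDigest are hypothetical, and MD5.Create() is swapped in for MD5CryptoServiceProvider (both expose the same ComputeHash(byte[]) method). The digest is an ordinary member, so whichever serializer you use should write it along with the rest of the object.

```csharp
using System;
using System.Linq;
using System.Security.Cryptography;

// Hypothetical example type: value-based equality plus a stored MD5 digest.
public sealed class Document : IEquatable<Document>
{
    public string Title { get; }
    public byte[] Payload { get; }

    // MD5 digest of Payload, computed once; it is serialized like any other member.
    public byte[] PayloadDigest { get; }

    public Document(string title, byte[] payload)
    {
        Title = title ?? throw new ArgumentNullException(nameof(title));
        Payload = payload ?? throw new ArgumentNullException(nameof(payload));
        using (var md5 = MD5.Create())
        {
            PayloadDigest = md5.ComputeHash(payload);
        }
    }

    public bool Equals(Document other)
    {
        if (other is null) return false;
        if (ReferenceEquals(this, other)) return true;
        return Title == other.Title
            // Cheap check first: different digests mean different payloads.
            && PayloadDigest.SequenceEqual(other.PayloadDigest)
            // Equal digests do not prove equality, so confirm with the full data.
            && Payload.SequenceEqual(other.Payload);
    }

    public override bool Equals(object obj) => Equals(obj as Document);

    public override int GetHashCode()
    {
        // Equal objects have equal titles and digests, so their hash codes match.
        // This value is only meant for comparisons inside one running process.
        return BitConverter.ToInt32(PayloadDigest, 0) ^ Title.GetHashCode();
    }
}
```

With this in place, a deserialized copy compares equal to the original through Equals(), and GetHashCode() is recomputed from the values rather than read from the stream.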
Be wary!
The default hash code of a .NET object often changes between run-time instances of a program.
In other words, if your program serializes object A, complete with hash code, to disk, then terminates, is later restarted, and deserializes object A from disk (or creates an identical object A at run time), the object will have a different hash code from the one that was stored.
This is in part because the default hash code of a reference type is tied to the runtime's bookkeeping for that particular object instance, not to the values it holds. In a new program instance that bookkeeping is different, and so is the hash code.
If you write your own GetHashCode, you can make a hash code that is consistent across processes. But there is a pitfall here you need to be aware of.
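As a rough illustration of a hand-written GetHashCode that depends only on field values, here is a sketch using the FNV-1a algorithm. The Customer type and its Id and Name members are hypothetical, and FNV-1a is my choice for the example, not something the answer prescribes. One thing the sketch deliberately avoids is string.GetHashCode(), which is randomized per process on .NET Core and later; whether that is the pitfall the answer has in mind is not stated.

```csharp
using System;
using System.Text;

// Hypothetical example type with a hash code computed only from its field values.
public sealed class Customer : IEquatable<Customer>
{
    public int Id { get; }
    public string Name { get; }

    public Customer(int id, string name)
    {
        Id = id;
        Name = name ?? throw new ArgumentNullException(nameof(name));
    }

    public bool Equals(Customer other) =>
        !(other is null) && Id == other.Id && Name == other.Name;

    public override bool Equals(object obj) => Equals(obj as Customer);

    public override int GetHashCode()
    {
        unchecked
        {
            // 32-bit FNV-1a: XOR each byte into the hash, then multiply by the prime.
            const uint offsetBasis = 2166136261;
            const uint prime = 16777619;
            uint hash = offsetBasis;

            // Feed the four bytes of Id in a fixed (little-endian) order.
            for (int i = 0; i < 4; i++)
            {
                hash = (hash ^ (byte)(Id >> (8 * i))) * prime;
            }

            // Feed the UTF-8 bytes of Name instead of calling string.GetHashCode().
            foreach (byte b in Encoding.UTF8.GetBytes(Name))
            {
                hash = (hash ^ b) * prime;
            }

            return (int)hash;
        }
    }
}
```

Because the result depends only on the bytes of Id and Name, two runs of the program should produce the same value for the same field values. The hash still has to be kept in sync with Equals(): any field that Equals() ignores must also be left out of the hash.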
Is there any information which you can use to tell which objects were serialized and deserialized from which originals? If so, then you can override GetHashCode() to calculate a hash code based on that information.
If not, you might be able to generate one synthetically by assigning a UUID to each newly-created object. Serialize that value along with the other data so the reconstructed objects have the same UUID. You can then simply override GetHashCode() to return that UUID's hash code. (That should do the job if what you're looking for is a sort of modified version of referential equality.)
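The UUID approach could look roughly like the following. This is only a sketch: TrackedItem, its Id and Data members, and the two-constructor layout are hypothetical names chosen for the example, and System.Guid stands in for the UUID. The second constructor represents whatever your deserializer does when it rebuilds the object with the stored identifier.

```csharp
using System;

// Hypothetical example type whose identity is a Guid assigned at creation time.
public sealed class TrackedItem : IEquatable<TrackedItem>
{
    public Guid Id { get; }
    public string Data { get; set; }

    // New objects get a fresh identifier.
    public TrackedItem(string data) : this(Guid.NewGuid(), data) { }

    // Rebuilt objects reuse the identifier that was serialized with the data.
    public TrackedItem(Guid id, string data)
    {
        Id = id;
        Data = data;
    }

    // Identity-style equality: same Id means "same original object".
    public bool Equals(TrackedItem other) => !(other is null) && Id == other.Id;

    public override bool Equals(object obj) => Equals(obj as TrackedItem);

    public override int GetHashCode() => Id.GetHashCode();
}
```

For instance, `var copy = new TrackedItem(original.Id, original.Data);` simulates a round trip: copy and original then compare equal and share a hash code even if Data is later changed on one of them, which is the "modified referential equality" the answer describes.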