Hashing image binaries - how much of the byte array to use?
I would like to hash images that have been converted to byte arrays. The faster the process, the better, so I was wondering how much of a 300000-element byte array I really need to feed into the hash function (SHA-1 in this case) to get a unique hash string. Does anybody know whether the first chunk of an image binary is all metadata? Is there a magic index I can use instead of the full length of the byte array, e.g. hashing only the first 5000 bytes?
Comments (1)
In my opinion, this comes down to the probability of each image getting a unique hash. If you use only the first 5000 bytes of a 300000-byte image, then two images that differ only in their lower part (beyond those first 5000 bytes) would have the same hash, because the hashed input is identical. This is a limitation of the truncated input rather than of SHA-1.
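To make the collision concrete, here is a minimal Python sketch. The question does not name a language, so Python and the helper names `prefix_sha1` and `full_sha1` are assumptions for illustration, and the two "images" are synthetic byte strings rather than real image files.

```python
import hashlib

def prefix_sha1(data: bytes, n: int = 5000) -> str:
    """SHA-1 of only the first n bytes (the shortcut asked about)."""
    return hashlib.sha1(data[:n]).hexdigest()

def full_sha1(data: bytes) -> str:
    """SHA-1 of the whole byte array."""
    return hashlib.sha1(data).hexdigest()

# Two synthetic 300000-byte buffers that differ only in the very last byte.
image_a = bytes(300_000)
image_b = bytes(299_999) + b"\x01"

print(prefix_sha1(image_a) == prefix_sha1(image_b))  # True: the prefixes are identical
print(full_sha1(image_a) == full_sha1(image_b))      # False: the full hash sees the change
```

Hashing a few hundred kilobytes with SHA-1 is usually quick on current hardware, so it may be worth measuring the full hash before settling on a shortcut.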
You could also spread the 5000 bytes you hash evenly across the 300000 bytes.
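One way to read the even-distribution suggestion is to take every k-th byte and hash that sample. The sketch below assumes Python, and the function name `sampled_sha1` and its `sample_size` parameter are hypothetical names chosen for illustration.

```python
import hashlib

def sampled_sha1(data: bytes, sample_size: int = 5000) -> str:
    """SHA-1 over about sample_size bytes taken at a fixed stride across data."""
    if len(data) <= sample_size:
        return hashlib.sha1(data).hexdigest()
    step = len(data) // sample_size          # 300000 // 5000 == 60
    sample = data[::step][:sample_size]      # every 60th byte, capped at sample_size
    return hashlib.sha1(sample).hexdigest()

base = bytes(300_000)

# Change a byte that the stride happens to visit (299940 is a multiple of 60).
hit = bytearray(base)
hit[299_940] = 1

# Change a byte that falls between two sampled positions.
miss = bytearray(base)
miss[299_999] = 1

print(sampled_sha1(base) == sampled_sha1(bytes(hit)))   # False: change detected
print(sampled_sha1(base) == sampled_sha1(bytes(miss)))  # True: change missed
```

Sampling spreads the risk over the whole file instead of concentrating it at the end, but any byte between sampled positions is still invisible to the hash. Only hashing the full array reflects every byte.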