Single Instance File Storage with Java
I was storing some files based on a checksum, but I found a flaw: two checksums can sometimes be identical.
I always try to look for an existing API instead of reinventing the wheel, but I can't find anything.
I know there's JSR 268 and JackRabbit as a standard for content storage, but my app is light-years away from using such a thing.
So, are there approaches for single-instance file storage with Java, or should I just keep searching for a new checksum algorithm?
EDIT:
When numcheck is not working: two files are exactly the same, just in different file system locations. However, when they are sent from the client, it is impossible on the server side to know the paths they had before, so it is the same file twice, with the same checksum.
If you want to retrieve either one, how do you check that?
I wanted to know if there is a standard approach, API, or algorithm that could help me spot the difference.
Comments (2)
No matter how strong a hashing algorithm is, there is always a chance of a collision. A hashing algorithm generates a finite number of hashes from an infinite number of inputs.
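To make the point concrete, here is a minimal sketch using Java's built-in `String.hashCode()`, which is a 32-bit non-cryptographic hash and only a stand-in for whatever checksum the question uses. The class name `HashCollisionDemo` and the helper `sameHash` are hypothetical, chosen just for this illustration.

```java
public class HashCollisionDemo {

    // Hypothetical helper: true when both strings map to the same 32-bit hash.
    public static boolean sameHash(String a, String b) {
        return a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        String a = "Aa";
        String b = "BB";
        // 'A' * 31 + 'a' = 65 * 31 + 97 = 2112
        // 'B' * 31 + 'B' = 66 * 31 + 66 = 2112
        System.out.println(sameHash(a, b)); // true: the hashes collide
        System.out.println(a.equals(b));    // false: the contents differ
    }
}
```

A longer hash such as SHA-256 makes accidental collisions far less likely, but the same counting argument still applies.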
The only way to be certain whether two files are identical is to compare them bit by bit. Hashing them is easier and faster, but carries with it the risk of a collision.
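One way to combine the two ideas is to use the hash as a fast lookup key and fall back to a byte-by-byte comparison only when a stored file with the same hash already exists. The sketch below is an illustration under assumptions, not a standard API: the class `SingleInstanceStore`, its methods, the choice of SHA-256, and the `-1`, `-2` suffix scheme for colliding files are all hypothetical.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class SingleInstanceStore {
    private final Path root; // hypothetical directory that holds the stored files

    public SingleInstanceStore(Path root) throws IOException {
        this.root = Files.createDirectories(root);
    }

    /** Stores the content once and returns the path of the stored copy. */
    public Path store(Path source) throws IOException {
        String hash = sha256Hex(source);
        for (int i = 0; ; i++) {
            Path candidate = root.resolve(i == 0 ? hash : hash + "-" + i);
            if (!Files.exists(candidate)) {
                return Files.copy(source, candidate); // first time this content is seen
            }
            if (sameContent(source, candidate)) {
                return candidate; // identical bytes are already stored
            }
            // Same hash but different bytes: a real collision, so try the next slot.
        }
    }

    /** Hashes the file content with SHA-256 and returns it as lowercase hex. */
    public static String sha256Hex(Path file) throws IOException {
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
        try (InputStream in = new BufferedInputStream(Files.newInputStream(file))) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                digest.update(buffer, 0, n);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    /** Compares two files byte by byte; this is the only collision-proof check. */
    public static boolean sameContent(Path a, Path b) throws IOException {
        if (Files.size(a) != Files.size(b)) {
            return false;
        }
        try (InputStream x = new BufferedInputStream(Files.newInputStream(a));
             InputStream y = new BufferedInputStream(Files.newInputStream(b))) {
            int bx;
            while ((bx = x.read()) != -1) {
                if (bx != y.read()) {
                    return false;
                }
            }
            return true;
        }
    }
}
```

With this layout, two uploads of the same bytes from different client paths resolve to the same stored file, which is the intended single-instance behaviour. If each upload still has to be retrievable on its own, the server would need to record its own identifier or metadata per upload alongside the stored path, since the content alone cannot tell the two apart.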