Memory-efficient hash algorithm for large amounts of data
I am using C# in Windows Mobile 5. The program downloads large files from the internet along with their hash values. Then it computes a hash value of the downloaded data and compares it to the expected hash value. The idea is to verify that the entire file was downloaded uncorrupted.
The problem is that the file is large enough that if I put its entire contents into memory as a byte array, the device will run out of memory. However, I need the bytes in some form so that I can compute their hash. Is it possible to compute the hash without having all the bytes in memory at once? Preferably I would like to compute SHA1 hashes using the SHA1Managed class, but I am willing to change that if necessary. I did notice that there is an overload of the SHA1Managed.ComputeHash() method that accepts a Stream, but I don't know whether it uses any less memory than just pulling all the bytes into memory, and the memory profilers I know of for .NET CF are completely useless.
SHA1Managed.ComputeHash(Stream) should be more memory efficient, assuming that you're discarding the contents of the stream after you compute your hash value. How much memory you will use will depend in part on the Stream implementation.
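For example, verifying a downloaded file could look like the following. This is a minimal sketch assuming the file has already been saved to local storage; the path and the expected-hash byte array are placeholders, not something from the original question.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

class HashVerifier
{
    // Compares the SHA1 of the file at 'path' against 'expectedHash'.
    static bool VerifyFile(string path, byte[] expectedHash)
    {
        using (Stream stream = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            SHA1Managed sha1 = new SHA1Managed();
            // ComputeHash(Stream) reads the stream in small chunks internally
            // (4 KB at a time on the desktop framework), so the whole file
            // never has to be resident in memory at once.
            byte[] actualHash = sha1.ComputeHash(stream);

            if (actualHash.Length != expectedHash.Length)
                return false;
            for (int i = 0; i < actualHash.Length; i++)
                if (actualHash[i] != expectedHash[i])
                    return false;
            return true;
        }
    }
}
```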
Here is how it's done on the desktop:
Compute a hash from a stream of unknown length in C#
It should be pretty easy to test whether the stream implementation pulls in the entire file or not by using an input source larger than the amount of memory available.
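If ComputeHash(Stream) on the device does turn out to buffer too much, a fallback is to feed the hash manually in small blocks via TransformBlock/TransformFinalBlock, which is essentially what the linked desktop answer does. This sketch assumes the .NET CF build of SHA1Managed exposes those HashAlgorithm members the same way the desktop framework does:

```csharp
using System.IO;
using System.Security.Cryptography;

class ChunkedHasher
{
    // Hashes a stream of unknown length while keeping only a small
    // fixed-size buffer in memory.
    static byte[] HashStream(Stream stream)
    {
        SHA1Managed sha1 = new SHA1Managed();
        byte[] buffer = new byte[4096]; // only 4 KB resident at any time
        int read;
        while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
        {
            // Passing the input buffer as the output buffer is allowed;
            // TransformBlock just copies input to output.
            sha1.TransformBlock(buffer, 0, read, buffer, 0);
        }
        // An empty final block finishes the computation.
        sha1.TransformFinalBlock(buffer, 0, 0);
        return sha1.Hash;
    }
}
```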