Maximum byte limit for the update method in Python's hashlib module

Posted on 2024-10-16 23:17:42

I am trying to compute the MD5 hash of a file with the function hashlib.md5() from the hashlib module.

So I wrote this piece of code:

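import hashlib

Buffer = 128  # number of bytes read per call
m = hashlib.md5()

# The with block closes the file even if reading fails.
with open("c:\\file.tct", "rb") as f:
    while True:
        p = f.read(Buffer)
        if not p:  # an empty bytes object means end of file
            break
        m.update(p)
print(m.hexdigest())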

I noticed that update is faster if I increase the value of the Buffer variable to 64, 128, 256 and so on.
Is there an upper limit I cannot exceed? I suppose it might only be a RAM problem, but I don't know.

樱娆 2024-10-23 23:17:42

Very large chunk sizes (≈2**40) lead to a MemoryError, i.e. there is no limit other than available RAM. On the other hand, bufsize is limited to 2**31-1 on my machine:

import hashlib
from functools import partial

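def md5(filename, chunksize=2**15, bufsize=-1):
    # chunksize: bytes passed to each update() call.
    # bufsize: buffering argument for open(); -1 selects the default.
    m = hashlib.md5()
    with open(filename, 'rb', bufsize) as f:
        # iter() with a sentinel calls f.read(chunksize) until it returns b''.
        for chunk in iter(partial(f.read, chunksize), b''):
            m.update(chunk)
    return m  # the hash object; call .hexdigest() on it for the hex string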

A big chunksize can be as slow as a very small one. Measure it.

For ≈10MB files, a chunksize of 2**15 was the fastest of those I tested.
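
One way to measure it is sketched below. This is only a minimal sketch, not the setup behind the numbers above: the helper names md5_hex and time_chunk_sizes are made up for illustration, and the input is assumed to be a 10 MB temporary file of random bytes. Timings depend heavily on the machine, the disk, and the OS file cache, so treat the output as indicative only.

```python
import hashlib
import os
import tempfile
import time
from functools import partial


def md5_hex(filename, chunksize):
    """Hash a file in chunks of `chunksize` bytes and return the hex digest."""
    m = hashlib.md5()
    with open(filename, 'rb') as f:
        for chunk in iter(partial(f.read, chunksize), b''):
            m.update(chunk)
    return m.hexdigest()


def time_chunk_sizes(filename, chunk_sizes, repeats=3):
    """Return {chunksize: best wall-clock seconds over `repeats` runs}."""
    timings = {}
    for size in chunk_sizes:
        best = float('inf')
        for _ in range(repeats):
            start = time.perf_counter()
            md5_hex(filename, size)
            best = min(best, time.perf_counter() - start)
        timings[size] = best
    return timings


if __name__ == '__main__':
    # Assumed test input: a 10 MB temporary file of random bytes.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(10 * 1024 * 1024))
    try:
        sizes = [2**7, 2**10, 2**15, 2**20]
        for size, seconds in time_chunk_sizes(tmp.name, sizes).items():
            print(f'chunksize=2**{size.bit_length() - 1}: {seconds:.4f} s')
    finally:
        os.remove(tmp.name)
```

Taking the best of several runs reduces noise from other processes. After the first pass the file is likely to be served from the OS cache, so the comparison mostly reflects per-call overhead rather than disk speed.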
