Partitioning a file for parallel download

Posted on 2024-12-29 16:38:55, 221 words, 0 views, 0 comments

I want to make a multi-threaded downloader (in Python), and I need to tell each thread where to start and how many bytes to download. To do that, I get the remote file size and divide it by, for example, 2. Now, let's say the remote file size is 5: when I divide it by 2, I get 2 as the result. I can start the download, but I will lose a byte (because 2*2=4, not 5). I can't use floats because I can't download half a byte. How can I divide that number and get a list such as [2, 3]?

Comments (3)

夜访吸血鬼 2025-01-05 16:38:55

Use divmod:

>>> divmod(5, 2)
(2, 1)

This tells you that 5 divided by 2 is 2 with remainder 1, so the last piece will be 2 + 1 = 3 bytes.

>>> divmod(12345, 6)
(2057, 3)

Here, you'll have 5 chunks of 2057 bytes and a last chunk of 2057 + 3 = 2060 bytes.

This algorithm also works when the division leaves no remainder:

>>> divmod(12345, 5)
(2469, 0)

Here, you'll have 4 chunks of 2469 bytes plus a last chunk of 2469 + 0 = 2469 bytes.

So your chunk sizes could be computed as:

def chunk_sizes(filesize, num_chunks):
    # d is the base chunk size, r the leftover bytes
    d, r = divmod(filesize, num_chunks)
    result = [d] * num_chunks
    # the last chunk absorbs the remainder
    result[-1] += r
    return result
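
Since each thread also needs to know where to start, the sizes can be turned into byte offsets. The following is only a sketch: `byte_ranges` is a hypothetical helper name, not part of the answer above, and it assumes `filesize >= num_chunks` so that no chunk is empty. The ranges are inclusive on both ends, which is the form an HTTP `Range: bytes=start-end` header expects, assuming the server supports range requests.

```python
from itertools import accumulate


def chunk_sizes(filesize, num_chunks):
    # Same approach as above: equal chunks, remainder goes to the last one.
    d, r = divmod(filesize, num_chunks)
    result = [d] * num_chunks
    result[-1] += r
    return result


def byte_ranges(filesize, num_chunks):
    # Hypothetical helper: convert chunk sizes to inclusive (start, end) offsets.
    sizes = chunk_sizes(filesize, num_chunks)
    ends = list(accumulate(sizes))   # exclusive end offset of each chunk
    starts = [0] + ends[:-1]         # each chunk starts where the previous ended
    return [(s, e - 1) for s, e in zip(starts, ends)]


print(byte_ranges(5, 2))  # [(0, 1), (2, 4)]
```

Each thread could then build its own header value with something like `"bytes={}-{}".format(start, end)`.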

朕就是辣么酷 2025-01-05 16:38:55

If you want to get the size of each chunk, you can simply add the remainder of the division to the last element:

>>> file_size = 11
>>> no_of_chunks = 3
>>> chunks = [file_size // no_of_chunks] * no_of_chunks
>>> chunks[-1] += file_size % no_of_chunks
>>> chunks
[3, 3, 5]

You can also distribute the remainder across the chunks instead, so that the chunk sizes differ by at most 1:

>>> chunks = [file_size // no_of_chunks] * no_of_chunks
>>> for i in range(file_size % no_of_chunks):
...     chunks[i] += 1
...
>>> chunks
[4, 4, 3]
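
The even distribution can also be wrapped in a small function. This is a minimal sketch; the name `even_chunk_sizes` is made up for illustration. It gives one extra byte to each of the first `file_size % no_of_chunks` chunks.

```python
def even_chunk_sizes(file_size, no_of_chunks):
    # base is the floor of the average size; extra is how many bytes are left over
    base, extra = divmod(file_size, no_of_chunks)
    # the first `extra` chunks each take one of the leftover bytes
    return [base + 1 if i < extra else base for i in range(no_of_chunks)]


print(even_chunk_sizes(11, 3))  # [4, 4, 3]
```

Compared with putting the whole remainder on the last chunk, this keeps the work per thread as balanced as possible, which matters more as the number of chunks grows.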

哥,最终变帅啦 2025-01-05 16:38:55

Special-case the last thread: assign it however many bytes are left.
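
One way this might look in code is sketched below. The function name `thread_assignments` and the `(start, length)` tuple layout are assumptions made for illustration. Every thread gets the same length except the last, whose length is computed from what remains.

```python
def thread_assignments(file_size, num_threads):
    chunk = file_size // num_threads  # size for every thread but the last
    assignments = []
    for i in range(num_threads):
        start = i * chunk
        if i == num_threads - 1:
            # last thread: take everything from its start to the end of the file
            length = file_size - start
        else:
            length = chunk
        assignments.append((start, length))
    return assignments


print(thread_assignments(5, 2))  # [(0, 2), (2, 3)]
```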
