Partitioning a file for parallel download
I want to write a multi-threaded downloader in Python, and I need to tell each thread where to start and how many bytes to download. To do that, I get the remote file size and divide it by, say, 2. Now suppose the remote file size is 5 bytes: dividing by 2 gives 2, so if I start the download I will lose a byte (because 2*2 = 4, not 5). I can't use floats because I can't download half a byte. How can I divide that number and get a list such as [2, 3]?
Comments (3)
Use divmod: divmod(5, 2) returns (2, 1), which tells you that 5 divided by 2 is 2 with remainder 1, so the last piece will be 2 + 1 = 3 bytes.
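For a larger file, say 12345 bytes split into 6 parts, divmod gives a quotient of 2057 and a remainder of 3, so you'll have 5 chunks of 2057 bytes and a last slice of 2057 + 3 = 2060 bytes.
This algorithm also works when the division leaves no remainder: splitting the same 12345 bytes into 5 parts gives a quotient of 2469 and a remainder of 0, so you'll have 4 chunks of 2469 bytes plus a last slice of 2469 + 0 = 2469 bytes.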
So your chunk sizes are the quotient for every chunk except the last, and the quotient plus the remainder for the last one.
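The rule above can be sketched as a small function. This is only an illustration: the name chunk_sizes and its parameters are hypothetical, and the code the answer originally showed may have looked different.

```python
def chunk_sizes(total_bytes, num_chunks):
    """Split total_bytes into num_chunks sizes; the last chunk absorbs the remainder."""
    if num_chunks < 1:
        raise ValueError("num_chunks must be at least 1")
    quotient, remainder = divmod(total_bytes, num_chunks)
    # Every chunk but the last gets the quotient; the last gets quotient + remainder.
    return [quotient] * (num_chunks - 1) + [quotient + remainder]


print(chunk_sizes(5, 2))       # [2, 3]
print(chunk_sizes(12345, 6))   # [2057, 2057, 2057, 2057, 2057, 2060]
print(chunk_sizes(12345, 5))   # [2469, 2469, 2469, 2469, 2469]
```

The sizes always sum to total_bytes, so no byte is lost, whatever the remainder is.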
If you want the size of each chunk, you can simply add the remainder of the division to the last element.
You can also modify that to distribute the remainder across all chunks, so that the chunk sizes differ by at most 1.
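One possible way to spread the remainder, sketched under the assumption that it does not matter which chunks get the extra byte (here the trailing ones do, so the question's example still yields [2, 3]). The function name balanced_chunk_sizes is hypothetical.

```python
def balanced_chunk_sizes(total_bytes, num_chunks):
    """Split total_bytes into num_chunks sizes that differ by at most 1."""
    if num_chunks < 1:
        raise ValueError("num_chunks must be at least 1")
    quotient, remainder = divmod(total_bytes, num_chunks)
    # The last `remainder` chunks each take one extra byte; the rest take the quotient.
    first_bigger = num_chunks - remainder
    return [quotient + 1 if i >= first_bigger else quotient
            for i in range(num_chunks)]


print(balanced_chunk_sizes(5, 2))       # [2, 3]
print(balanced_chunk_sizes(10, 4))      # [2, 2, 3, 3]
print(balanced_chunk_sizes(12345, 6))   # [2057, 2057, 2057, 2058, 2058, 2058]
```

Compared with putting the whole remainder on the last chunk, this keeps the threads' workloads closer together when there are many chunks.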
Special-case the last thread: assign it however many bytes are left.
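A minimal sketch of that idea, turning the sizes into start and end offsets per thread. It assumes the download uses HTTP range requests, where a Range header such as bytes=0-1 has an inclusive end; the question does not say which protocol is used, and the name byte_ranges is hypothetical.

```python
def byte_ranges(total_bytes, num_threads):
    """Return (start, end) pairs with inclusive ends; the last thread takes the rest."""
    if num_threads < 1:
        raise ValueError("num_threads must be at least 1")
    if total_bytes < num_threads:
        # Assumption: every thread should get at least one byte.
        raise ValueError("more threads than bytes")
    chunk = total_bytes // num_threads
    ranges = []
    for i in range(num_threads):
        start = i * chunk
        if i == num_threads - 1:
            end = total_bytes - 1      # special case: everything that is left
        else:
            end = start + chunk - 1
        ranges.append((start, end))
    return ranges


for start, end in byte_ranges(5, 2):
    print("Range: bytes=%d-%d" % (start, end))
# Range: bytes=0-1
# Range: bytes=2-4
```

Each thread can then request its own range and write the result at offset start in the output file.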