Running multiple backup jobs simultaneously: theory and practice

Posted on 2024-10-01 09:54:00

While designing a fairly simple backup system for Linux in Python, I find myself asking: could there be any time advantage to backing up several datasets/archives simultaneously?

My intuition tells me that writing to several archives simultaneously would not buy me much time as I/O would already be the greatest bottleneck.

On the other hand, if using something like bz2, would there be an advantage to multi-threading, since the higher CPU demand reduces the I/O demand? Or is it a wash, since all threads would be doing essentially the same thing and therefore sharing the same bottlenecks?

Comments (1)

亽野灬性zι浪 2024-10-08 09:54:00

It depends on your system. If you have multiple disks, it could be very worthwhile to parallelize your backup job. If you have multiple processors, compressing multiple jobs in parallel may be worth your while.

If the processor is slow enough (and the disks are fast enough) that compression makes your CPU the bottleneck, you'll make some gains on multicore or hyperthreaded processors. The reduced I/O demand from writing compressed data is almost certainly a win if your CPU can keep up with the read speed of your drive(s).
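One rough way to check which side is the bottleneck is to measure how fast a single core can bz2-compress data and compare that figure with the read speed of your drive. The sketch below is only an illustration: the helper name `bz2_throughput` and the sample payload are made up for this example, and real backup data will compress at a different rate and ratio.

```python
import bz2
import time


def bz2_throughput(payload, level=9):
    """Compress payload once in memory.

    Returns (input MB consumed per second, compressed size / original size).
    """
    start = time.perf_counter()
    compressed = bz2.compress(payload, compresslevel=level)
    elapsed = time.perf_counter() - start
    megabytes = len(payload) / 1e6
    return megabytes / elapsed, len(compressed) / len(payload)


if __name__ == "__main__":
    # Hypothetical sample data; substitute a chunk read from a real dataset.
    sample = b"backup sample line\n" * 200_000
    rate, ratio = bz2_throughput(sample)
    print(f"bz2 consumed {rate:.1f} MB/s on one core, ratio {ratio:.3f}")
```

If the measured rate is well below what the disk can deliver, compression is probably the limiting factor and extra cores may help; if it is well above, the disk is more likely the limit.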

Anyway, this is all very system dependent. Try it and see. Run two jobs at once, then run the same two in serial, and compare the elapsed times. The cheap (coding-wise) way is to just run your backup script twice with different input and output parameters. Once you've established a winner, you can go further down that path.
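The serial-versus-parallel comparison could be sketched in Python along these lines. Everything here is an assumption for illustration: the function names, the generated sample datasets, and the choice of `tarfile` with bz2 are not from the question. Threads are used to keep the sketch short; to my knowledge CPython's bz2 module releases the GIL while compressing, but swapping in `ProcessPoolExecutor`, or simply launching the script twice as suggested above, avoids relying on that.

```python
import os
import tarfile
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor


def backup(job):
    """Write one bz2-compressed tar archive; job is (source_dir, archive_path)."""
    source_dir, archive_path = job
    with tarfile.open(archive_path, "w:bz2") as tar:
        tar.add(source_dir, arcname=os.path.basename(source_dir))
    return archive_path


def run_serial(jobs):
    """Run the jobs one after another and return the elapsed seconds."""
    start = time.perf_counter()
    for job in jobs:
        backup(job)
    return time.perf_counter() - start


def run_parallel(jobs):
    """Run all jobs at once, one thread per job, and return the elapsed seconds."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(jobs)) as pool:
        list(pool.map(backup, jobs))
    return time.perf_counter() - start


def make_sample_dataset(root, name, n_files=8, size=256 * 1024):
    """Create a throwaway directory of random files (hypothetical test data)."""
    path = os.path.join(root, name)
    os.makedirs(path)
    for i in range(n_files):
        with open(os.path.join(path, f"file{i}.bin"), "wb") as f:
            f.write(os.urandom(size))
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        jobs = []
        for name in ("dataset_a", "dataset_b"):
            src = make_sample_dataset(root, name)
            jobs.append((src, os.path.join(root, name + ".tar.bz2")))
        print(f"serial:   {run_serial(jobs):.2f} s")
        print(f"parallel: {run_parallel(jobs):.2f} s")
```

Treat the numbers with care: the second run reads files that are probably already in the page cache, and random data barely compresses, so for a fair comparison point the jobs at real datasets on the real disks and repeat each run a few times.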
