Cost/benefit of parallelization based on code size?
How do you figure out whether it's worth parallelizing a particular code block based on its code size? Is the following calculation correct?
Assume:
- Thread pool consisting of one thread per CPU.
- CPU-bound code block with execution time of X milliseconds.
Y = min(number of CPUs, number of concurrent requests)
Therefore:
- Cost: code complexity, potential bugs
- Benefit: (X * Y) milliseconds
My conclusion is that it isn't worth parallelizing for small values of X or Y, where "small" depends on how responsive your requests must be.
Comments (3)
One thing that will help you figure that out is Amdahl's Law.
Figure out how much speedup you want to achieve and how much parallelism you can actually achieve, then see if it's worth it.
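As a rough sketch of Amdahl's Law: if a fraction p of the runtime can be parallelized across n workers, the ideal speedup is 1 / ((1 - p) + p / n). The function name and the example numbers below are illustrative assumptions, and the formula ignores scheduling and synchronization overhead.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Ideal speedup predicted by Amdahl's Law.

    parallel_fraction: share of the runtime that can run in parallel (0..1).
    workers: number of threads/CPUs working on the parallel part.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


if __name__ == "__main__":
    # Hypothetical workload: 90% of the time is parallelizable.
    for n in (1, 2, 4, 8):
        print(n, round(amdahl_speedup(0.9, n), 2))
```

With 90% of the work parallelizable, 4 workers give at most about 3.08x, and no number of workers can push the speedup past 10x, because the serial 10% always remains.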
It depends on many factors, such as the difficulty of parallelizing the code, the speedup obtained from it (dividing the problem and joining the results both carry overhead costs), and the amount of time the code spends there (Amdahl's Law).
Well, the benefit is really more:
(X * (Y-1)) * Tc * Pf
Where Tc is the cost of the threading framework you are using. No threading framework scales perfectly, so using 2x the threads will likely give you, at best, 1.9x the speed.
Pf is a parallelization factor that depends entirely on the algorithm (e.g., whether or not you'll need to lock, which will slow the process down).
Also, it's Y-1, since single threaded is basically assuming Y==1.
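To make the two estimates comparable, here is a small sketch that computes both the question's X * Y figure and the adjusted one. It assumes Tc and Pf are treated as efficiency multipliers between 0 and 1 (so the "2x threads, 1.9x speed" example is read as Tc = 0.95); that reading, the function names, and the sample numbers are all assumptions for illustration.

```python
def effective_parallelism(cpus: int, concurrent_requests: int) -> int:
    # Y = min(number of CPUs, number of concurrent requests)
    return min(cpus, concurrent_requests)


def naive_benefit_ms(x_ms: float, y: int) -> float:
    # The question's estimate: X * Y milliseconds.
    return x_ms * y


def adjusted_benefit_ms(x_ms: float, y: int, tc: float, pf: float) -> float:
    # The adjusted estimate: (X * (Y - 1)) * Tc * Pf.
    # tc and pf are assumed to be efficiency factors in the range 0..1.
    return x_ms * (y - 1) * tc * pf


if __name__ == "__main__":
    # Hypothetical numbers: 8 CPUs, 4 concurrent requests, 50 ms block.
    y = effective_parallelism(cpus=8, concurrent_requests=4)
    print("naive:", naive_benefit_ms(50, y))
    print("adjusted:", adjusted_benefit_ms(50, y, tc=0.95, pf=0.8))
```

With Y == 1 the adjusted estimate drops to zero, which matches the point that single-threaded execution is the baseline.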
As for deciding, it's also a matter of user frustration and expectation: if the user is annoyed at waiting for something, speeding it up has a greater benefit than speeding up a task the user doesn't really mind. That isn't always just about wait times; it's partly about expectations.