Are there any guidelines to follow when choosing the number of processes with multiprocessing?
I'm just getting my feet wet with multiprocessing (and it's totally awesome!), but I was wondering whether there are any guidelines for selecting the number of processes. Is it just based on the number of cores on the server? Is it somehow based on the application you're running (number of loops, how much CPU it uses, etc.)? How do I decide how many processes to spawn? Right now I'm just guessing and adding or removing processes, but it would be great if there were some kind of guideline or best practice.
Another question: I know what happens if I add too few (the program is slow), but what if I add 'too many'?
Comments (2)
If all of your threads/processes are indeed CPU-bound, you should run as many processes as the CPU reports cores. Due to HyperThreading, each physical CPU core may present multiple virtual cores. Call multiprocessing.cpu_count() to get the number of virtual cores. If only a fraction p of your processes is CPU-bound, you can adjust that number by multiplying by p. For example, if half your processes are CPU-bound (p = 0.5) and you have two CPUs with 4 cores each and 2x HyperThreading, you should start 0.5 * 2 * 4 * 2 = 8 processes.
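The arithmetic above can be sketched in a few lines. multiprocessing.cpu_count() is the real standard-library call; the helper name suggested_process_count and its parameters are hypothetical, and the function only encodes this answer's rule of thumb, not a measured optimum.

```python
import multiprocessing


def suggested_process_count(virtual_cores, cpu_bound_fraction=1.0):
    """Hypothetical helper: virtual cores scaled by the CPU-bound fraction p.

    Encodes the rule of thumb from the text and never returns less than 1.
    """
    if not 0.0 < cpu_bound_fraction <= 1.0:
        raise ValueError("cpu_bound_fraction must be in (0, 1]")
    return max(1, round(virtual_cores * cpu_bound_fraction))


if __name__ == "__main__":
    # Number of virtual (logical) cores reported by the OS.
    cores = multiprocessing.cpu_count()
    print("virtual cores:", cores)
    print("fully CPU-bound:", suggested_process_count(cores))
    # Example from the text: 2 CPUs * 4 cores * 2 hyperthreads = 16 virtual cores.
    print("text example:", suggested_process_count(2 * 4 * 2, 0.5))
```

Treat the result as a starting point and confirm it by measuring the real workload.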
If you have too few processes, your application will run slower than it could. If your application scales perfectly and is only CPU-bound (i.e. is 10 times faster when executed on 10 times the number of cores), the slowdown is proportional. For example, if your system calls for 8 processes but you only start 4, you'll only use half of the processing capacity and take twice as long. Note that in practice no application scales perfectly, but some (ray tracing, video encoding) come pretty close.
If you have too many processes, the synchronization overhead will increase. If your program has little to no synchronization overhead, this won't impact the overall runtime, but it may make other programs appear slower than they are unless you set your processes to a lower priority. Excessive numbers of processes (say, 10000) are fine in theory if your OS has a good scheduler. In practice, virtually any synchronization will make the overhead unbearable.
If you're not sure whether your application is CPU-bound and/or scales perfectly, simply observe the system load with different process counts. You want the system load to be slightly under 100% or, more precisely, the load average reported by uptime to be about the number of virtual cores.
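One way to make that check concrete is to compare the load average with the core count from inside Python. This is a rough sketch: os.getloadavg() and os.cpu_count() are real standard-library calls (getloadavg exists only on Unix-like systems), while the helper names and the 10% tolerance are assumptions chosen for illustration.

```python
import os


def load_per_core(load_average, virtual_cores):
    """Return the load average divided by the number of virtual cores."""
    if virtual_cores < 1:
        raise ValueError("virtual_cores must be at least 1")
    return load_average / virtual_cores


def describe_load(ratio, tolerance=0.1):
    """Classify a load-per-core ratio (the tolerance is an arbitrary assumption)."""
    if ratio < 1.0 - tolerance:
        return "spare CPU capacity"
    if ratio > 1.0 + tolerance:
        return "oversubscribed"
    return "load roughly matches core count"


if __name__ == "__main__":
    cores = os.cpu_count() or 1
    if hasattr(os, "getloadavg"):  # not available on Windows
        one_minute, _, _ = os.getloadavg()
        ratio = load_per_core(one_minute, cores)
        print(f"1-minute load {one_minute:.2f} on {cores} cores: {describe_load(ratio)}")
```

Run it while the application is under a realistic workload; a reading taken on an idle machine says nothing about the process count.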
It's definitely based on what the application does. If it's CPU-heavy, the number of cores is a sane starting point. If it's IO-heavy, multiple processes won't help performance anyway. If it's mostly CPU with occasional IO (e.g. PNG optimisation), you can run a few more processes than the number of cores.
The only way to know for certain is to run your application with some realistic input and check the resource utilisation. If you have CPU time to spare, add more worker processes.
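A small timing harness is one way to run that experiment. The sketch below assumes a stand-in task: cpu_heavy, candidate_counts, and time_pool are hypothetical names, and the workload should be replaced with your real function and realistic input. Pool, Pool.map, and cpu_count are the standard multiprocessing API.

```python
import time
from multiprocessing import Pool, cpu_count


def cpu_heavy(n):
    """Stand-in CPU-bound task (hypothetical): sum of squares below n."""
    return sum(i * i for i in range(n))


def candidate_counts(cores):
    """Process counts to try: half, equal to, and twice the core count."""
    return sorted({max(1, cores // 2), cores, cores * 2})


def time_pool(processes, inputs):
    """Map cpu_heavy over inputs with a pool of the given size; return seconds.

    The measurement includes pool start-up and shutdown time.
    """
    start = time.perf_counter()
    with Pool(processes=processes) as pool:
        pool.map(cpu_heavy, inputs)
    return time.perf_counter() - start


if __name__ == "__main__":
    inputs = [200_000] * 32
    for count in candidate_counts(cpu_count()):
        print(f"{count:3d} processes: {time_pool(count, inputs):.2f} s")
```

If the timings stop improving as the process count grows, the extra workers are not buying anything for this workload.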