Does it make sense to run more PHP-FPM children than the number of CPU cores?
Assuming that I have a CPU with 4 cores and 4 threads, does it make sense to run e.g. 8 PHP-FPM workers by setting pm.max_children = 8
option? As far as I'm concerned, CPU with 4 threads can only run up to 4 processes in "real" parallel. Wouldn't it cause an overhead if CPU time was lost due to contexts switching between these 8 processes?
In contrast, Node.js cluster mode documentation recommends to run up to as many workers/children as number of cores. Doesn't the same recommendation apply here?
2 Answers
Yes, it makes sense, and you should probably always do so. Let me explain why.
PHP does not use threading and runs on a single core. PHP-FPM spawns many workers so that your processes can run on multiple cores.
It's important to understand how the OS uses process context switching to handle multiple processes concurrently. If you only have one core, you are still able to run multiple processes on your machine at the same time; the reason for this is process context switching. The OS uses the single core and switches between processes on the fly, processing each one a bit at a time depending on various factors, such as whether a process is waiting for some I/O, how long the process has been running, whether another process has a higher priority, and so on. The important part is that process context switching takes some time, and the single core is shared among multiple processes.
If you have multiple cores, processes can be executed in parallel on every core. However, you most likely still have more running processes than cores, so process context switching still happens, just at a lower rate.
The reason it's recommended to set pm.max_children to a value higher than your CPU core count is that in most cases your PHP process is not doing intensive CPU work but is mostly waiting for I/O, such as waiting for a SQL result, a curl response, or a disk read/write. These operations are called blocking I/O and are usually what consumes most of the time in a request. By setting pm.max_children to a value higher than the number of cores (sometimes even 5-10 times the number of cores), you benefit from the context switching your OS performs while a process is in a blocking/idle state. It's entirely possible to have over 20 PHP processes running that are just waiting for I/O. If you were to set pm.max_children to the number of cores, say 8, the cores might not be doing much, requests would pile up, and response times would be very slow.
If you are certain that your PHP processes have no blocking I/O and only perform calculations, for example, you might actually benefit from setting pm.max_children to exactly the number of cores, the reason being that process context switching slows things down and more running processes consume more resources. However, this scenario is unusual; most likely your processes do have blocking I/O and idle time.
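As a concrete illustration, a pool configuration along these lines allows several workers per core on a 4-core server with a mostly I/O-bound workload. The values below are illustrative assumptions, not tuned recommendations, and the file path varies by distribution:

```ini
; e.g. /etc/php/8.2/fpm/pool.d/www.conf (path varies by distribution)
pm = dynamic
; 4 cores, mostly I/O-bound workload: allow roughly 5 workers per core
pm.max_children = 20
pm.start_servers = 8
pm.min_spare_servers = 4
pm.max_spare_servers = 12
```

With pm = dynamic, FPM scales the worker count between the spare-server bounds, so idle workers don't hold memory indefinitely while bursts can still use all 20 children.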
There is a good article that goes in depth about process context switching on Linux.
There is also something called coroutines, used in the Swoole PHP extension. Coroutines also use context switching to achieve concurrency, but this is done programmatically, which consumes far fewer resources and is much faster than OS context switching. If you use Swoole, there is no need for PHP-FPM, since Swoole is faster, but it comes with its own issues you need to take care of. With Swoole, however, it's recommended to set as many workers as cores, to avoid OS context switching. You can have thousands of coroutines without affecting performance much.
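The cheapness of programmatic context switching is easy to demonstrate. This is a minimal sketch in Python's asyncio rather than Swoole, but the mechanism is the same: one thread runs a thousand coroutines whose I/O waits overlap, with no OS process switching involved.

```python
import asyncio
import time

async def fake_io(i: int) -> int:
    # Stand-in for an I/O wait (e.g. a SQL query); the event loop
    # switches to other coroutines instead of blocking the thread.
    await asyncio.sleep(0.1)
    return i

async def run_many(n: int):
    start = time.perf_counter()
    results = await asyncio.gather(*(fake_io(i) for i in range(n)))
    return results, time.perf_counter() - start

# 1,000 overlapping "requests" on a single thread: total time stays
# close to one 0.1 s wait rather than 1000 * 0.1 s.
results, elapsed = asyncio.run(run_many(1000))
print(f"{len(results)} coroutines finished in {elapsed:.2f}s")
```

Running a thousand OS processes this way would cost orders of magnitude more memory and scheduler overhead, which is why Swoole and Node.js pin one worker per core and multiplex requests inside it.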
Node.js uses an event loop, which is similar to Swoole's coroutines. The reason it's recommended to set the number of workers to match your cores is to avoid OS context switching and rely on the built-in context switching instead, since it's much faster and lighter.
The general answer is yes, because although you can't run that many threads in parallel, you can run them concurrently.
The key thing to understand is that in most real applications, a lot of the time spent processing a request is not spent using the local CPU - it's spent waiting for database queries, external APIs, even disk access. If you have one thread per CPU core, the CPU is simply sitting idle all that time. Allow additional threads, and one can be using the CPU while another is waiting for external data.
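This effect can be sketched in a few lines of Python (the same reasoning applies to FPM's worker processes). The worker and request counts below are illustrative assumptions for a hypothetical 4-core machine; each "request" just sleeps to simulate waiting on a database or API.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def handle_request(i: int) -> int:
    # Stand-in for an I/O-bound request: the thread just waits,
    # leaving the CPU free for other threads.
    time.sleep(0.2)
    return i

def serve(n_requests: int, n_workers: int) -> float:
    # Return the wall-clock time to serve all requests with a fixed pool.
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        list(pool.map(handle_request, range(n_requests)))
    return time.perf_counter() - start

# 16 I/O-bound "requests" on a hypothetical 4-core machine:
t_few = serve(16, 4)    # workers == cores: requests queue up in waves
t_many = serve(16, 16)  # workers > cores: the waits all overlap
print(f"4 workers: {t_few:.2f}s, 16 workers: {t_many:.2f}s")
```

With 4 workers the 16 sleeps run in four sequential waves; with 16 workers they overlap, so the oversubscribed pool finishes several times faster even though the CPU never ran more than a handful of instructions.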
Only if your application is highly unusual and spending 100% of its time using the CPU would limiting to one thread per core make sense.
The reason this doesn't apply to node.js is that it implements concurrency within a single thread using asynchronous code: you can tell the current thread "start doing this, and while waiting for the result, get on with processing a different request". This isn't possible with native PHP, which uses a "shared nothing" approach - each request gets its own thread or process - but there are projects such as Swoole and Amp that add support for this asynchronous approach.