Is it OK to use PHP-FPM to manage queue consumers?
There's a beanstalkd queue that gets filled with a lot of tasks roughly every 10 minutes, and it is top priority that each task is processed ASAP. A task can take more than a few milliseconds to complete because it calls third-party services, which tend to time out every now and then.
Since PHP doesn't have multithreading, one option would be to keep a lot of idle workers around, each trying to reserve a task, but that is likely to take too much RAM, which may not be available on those boxes.
Is it a good idea to use PHP-FPM to adjust the number of workers and save some RAM? Is it production-ready? Are there better solutions?
Thanks
Comments (2)
I'm running a queue system that deals with millions of messages per day. Most of it goes through Amazon SQS, but I'm also running a new Beanstalkd system with over 600,000 messages in it right now.
As described in a blog post on the subject, I have shell scripts running in a loop, processing messages (a loop within the PHP script to run multiple jobs before returning is also somewhat useful, at least for smaller jobs).
Those shell scripts are started with Supervisord. There's another blog post on the use of that as well. I'm currently running over 800 worker scripts (for a few different types of jobs) across nine machines, all pulling from various queues and putting data back into other queues, or writing to the DB or files. Increasing the number of workers per machine is a matter of increasing "numprocs" (or having it large enough already) and then starting more as required. You could also have, say, 5 that auto-start and another block of 50 that are ready to start as required.
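As a rough illustration of the "5 auto-start plus a block of 50 on standby" idea, a Supervisord configuration might look like the fragment below. The program names and the script path are assumptions made up for this sketch, not taken from the setup described above.

```ini
; Hypothetical names and path; adjust to your own layout.
[program:worker-base]
command=/usr/local/bin/run-worker.sh
process_name=%(program_name)s_%(process_num)02d
numprocs=5
autostart=true
autorestart=true

; Same worker, but not started until asked for.
[program:worker-extra]
command=/usr/local/bin/run-worker.sh
process_name=%(program_name)s_%(process_num)02d
numprocs=50
autostart=false
autorestart=true
```

With a layout like this, the standby block could be brought up with `supervisorctl start worker-extra:*` when the queue backs up, and stopped again with `supervisorctl stop worker-extra:*`.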
I find each worker only takes around 20 MB of non-shared memory (the rest is common between processes). This does depend on the tasks that the workers do, of course. Resizing images can take a lot of effort. It's partly for this reason that I have things set up so the PHP script can be restarted frequently.
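A minimal sketch of a worker that handles a bounded number of jobs and then returns, so the wrapping shell loop or Supervisord can start a fresh process and memory never creeps up. The function `runWorker`, its parameters, and the in-memory array standing in for the queue are all hypothetical; in a real worker `$reserve` would wrap your queue client's blocking reserve call with a timeout.

```php
<?php
declare(strict_types=1);

/**
 * Process jobs until $maxJobs is reached, memory passes $maxBytes,
 * or the queue reports nothing to do. Returns the number of jobs handled.
 *
 * $reserve (hypothetical) returns the next job payload, or null when idle.
 * $handle  (hypothetical) processes one payload.
 */
function runWorker(
    callable $reserve,
    callable $handle,
    int $maxJobs = 100,
    int $maxBytes = 64 * 1024 * 1024
): int {
    $done = 0;
    while ($done < $maxJobs && memory_get_usage(true) < $maxBytes) {
        $job = $reserve();
        if ($job === null) {
            break; // nothing to do: return and let the supervisor restart us
        }
        $handle($job);
        $done++;
    }
    return $done;
}

// Demo: an array stands in for the queue (assumption for illustration only).
$queue = ['a', 'b', 'c', 'd', 'e'];
$processed = [];
$count = runWorker(
    function () use (&$queue) { return array_shift($queue); },
    function (string $job) use (&$processed) { $processed[] = strtoupper($job); },
    3
);
echo "processed $count jobs\n";
```

Capping both the job count and the memory footprint keeps the restart policy in one place; the right limits depend on how heavy the jobs are.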
Whenever I had to run stuff concurrently (or asynchronously), I dispatched the jobs to gearman workers. I usually had at least one process per CPU core per physical machine running.
PHP-FPM is a FastCGI daemon. So you'd basically have your beanstalkd processor run a bunch of HTTP requests against your own system. Those would probably have to go through your HTTP stack. Not sure that is such a great idea.
You could also check out pcntl_fork to fork your current process into multiple concurrently running processes.
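For reference, a small sketch of the pcntl_fork approach: the parent forks a fixed number of children, each child runs a callback and exits, and the parent collects the exit codes. It assumes the pcntl extension is loaded (normally CLI only, not available on Windows); `forkWorkers` and the sleeping callback are hypothetical names for illustration. In a real worker each child should open its own queue connection after the fork rather than share the parent's socket.

```php
<?php
declare(strict_types=1);

/**
 * Fork $numWorkers children, run $work($index) in each, wait for all.
 * Returns a map of child index => exit code (-1 if killed by a signal).
 */
function forkWorkers(int $numWorkers, callable $work): array
{
    $pids = [];
    for ($i = 0; $i < $numWorkers; $i++) {
        $pid = pcntl_fork();
        if ($pid === -1) {
            throw new RuntimeException('pcntl_fork failed');
        }
        if ($pid === 0) {
            // Child: do the work, then exit so it never runs the parent's code.
            exit((int) $work($i));
        }
        $pids[$pid] = $i; // parent: remember which child got which index
    }

    $exitCodes = [];
    foreach ($pids as $pid => $index) {
        pcntl_waitpid($pid, $status); // blocks until this child finishes
        $exitCodes[$index] = pcntl_wifexited($status)
            ? pcntl_wexitstatus($status)
            : -1;
    }
    return $exitCodes;
}

// Demo: each child sleeps briefly in place of a slow third-party call.
$codes = forkWorkers(3, function (int $i): int {
    usleep(10000 * ($i + 1));
    return $i; // becomes the child's exit code
});
ksort($codes);
print_r($codes);
```

Forked children still cost memory per process, so this changes who manages the workers rather than how much RAM they need in total.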