Asynchronous processing in PHP - one worker per job

Published 2024-09-14 20:54:03

Consider a PHP web application whose purpose is to accept user requests to start generic asynchronous jobs, and then create a worker process/thread to run the job. The jobs are not particularly CPU or memory intensive, but are expected to block on I/O calls fairly often. No more than one or two jobs should be started per second, but due to the long run times there may be many jobs running at once.

Therefore, it's of utmost importance that the jobs run in parallel. Also, each job must be monitored by a manager daemon responsible for killing hung workers, aborting workers on user request, etc.

What is the best way to go about implementing a system such as this? I can see:

  1. Forking a worker from the manager - this appears to be the lowest-level option, and I would have to implement a monitoring system myself. Apache is the web server, so it appears that this option would require any PHP workers to be started via FastCGI.
  2. Using some sort of job/message queue (Gearman, beanstalkd, RabbitMQ, etc.) - initially, this seemed like the obvious choice. After some research, I'm somewhat confused by all of the options. For instance, Gearman looks like it's designed for huge distributed systems where there is a fixed pool of workers... so I don't know if it's right for what I need (one worker per job; a rough sketch of this option follows below).
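For concreteness, the web-request side of option 2 might look roughly like the sketch below, using the PECL gearman extension. The task name `run_job`, the server address, and the payload shape are illustrative assumptions, not anything from the original question.

```php
<?php
// Web-request side: hand the job off to gearmand and return immediately.
// Assumes the PECL "gearman" extension and a gearmand server on localhost.
$client = new GearmanClient();
$client->addServer('127.0.0.1', 4730);

// "run_job" is a placeholder task name; a worker registered for it will pick this up.
$payload = json_encode(['job_id' => 42, 'params' => ['url' => 'http://example.com/']]);
$handle  = $client->doBackground('run_job', $payload);

if ($client->returnCode() !== GEARMAN_SUCCESS) {
    http_response_code(500);
    exit('Could not queue job');
}

echo "Job queued, handle: $handle";
```

Whether a fixed pool of workers or one worker per job picks these tasks up is then a separate decision made on the worker side.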

握住我的手 2024-09-21 20:54:03

Well, if you're on Linux, you can use pcntl_fork to fork children off. The "master" then watches the children. Each child completes its task and then exits normally.
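As a rough sketch of that fork-and-watch pattern (not taken from the answer itself), run from the CLI and assuming the pcntl extension; `run_job()` and the job array are placeholders:

```php
<?php
// Minimal fork-and-watch sketch (CLI only; requires the pcntl extension).
// run_job() is a placeholder for the actual work.

function run_job(array $job): void {
    sleep(5); // stand-in for long, I/O-bound work
}

$jobs = [['id' => 1], ['id' => 2]];
$children = []; // pid => job

foreach ($jobs as $job) {
    $pid = pcntl_fork();
    if ($pid === -1) {
        fwrite(STDERR, "fork failed\n");
    } elseif ($pid === 0) {
        // Child: do the work, then exit normally.
        run_job($job);
        exit(0);
    } else {
        // Parent ("master"): remember the child.
        $children[$pid] = $job;
    }
}

// The master watches the children until they have all exited.
while ($children) {
    $pid = pcntl_waitpid(-1, $status);
    if ($pid > 0) {
        unset($children[$pid]);
    }
}
```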

Personally, in my implementations I've never needed a message queue. I simply used an array in the "master" along with lock files. When a child gets a job, it writes a lock file containing the job ID. The master then waits until that child exits. If the lock file still exists after the child has exited, I know the task wasn't completed, and I relaunch a child with the same job (after removing the lock file). Depending on your situation, you could implement the queue in a simple database table: insert jobs into the table, and have the master check it every 30 or 60 seconds for new jobs. Then only delete a job from the table once the child is finished (and has removed the lock file). This would have issues if you had more than one "master" running at a time, but you could implement a global "master PID file" to detect and prevent multiple instances...
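The lock-file bookkeeping described above might be wired up roughly as follows; the lock-file path, the `jobs` table, and the `launch_child()` helper are all illustrative assumptions, not part of the original answer.

```php
<?php
// Sketch of the lock-file check inside the master. The child is expected to
// create /tmp/job-<id>.lock when it starts and delete it just before exiting
// successfully.

function wait_and_requeue(int $childPid, array $job, PDO $db): void {
    $lockFile = "/tmp/job-{$job['id']}.lock";

    // Block until this particular child exits.
    pcntl_waitpid($childPid, $status);

    if (file_exists($lockFile)) {
        // The child exited without removing its lock file, so the task did
        // not finish: clean up and relaunch with the same job.
        unlink($lockFile);
        launch_child($job); // hypothetical helper that forks a fresh worker
    } else {
        // Lock file gone: the job completed, so drop it from the queue table.
        $db->prepare('DELETE FROM jobs WHERE id = ?')->execute([$job['id']]);
    }
}
```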

And I would not suggest forking with FastCGI. It can result in some very obscure problems, since the environment is meant to persist. Instead, use CGI if you must have a web interface, but ideally use a CLI app (a daemon). To interface with the master from other processes, you can either use sockets for TCP communication, or create a FIFO file for communication.
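A minimal sketch of the FIFO approach, assuming the posix extension; the path and the one-command-per-line format are arbitrary choices made for illustration.

```php
<?php
// FIFO command channel for the master daemon (requires the posix extension).
$fifo = '/tmp/job-master.fifo';
if (!file_exists($fifo)) {
    posix_mkfifo($fifo, 0600);
}

// Master side: block on the FIFO and read one command per line, e.g.
// "abort 42" sent by the web front end to abort job 42. A real daemon would
// probably use stream_select() so it can do other work at the same time.
$fh = fopen($fifo, 'r');
while (($line = fgets($fh)) !== false) {
    [$cmd, $jobId] = array_pad(explode(' ', trim($line), 2), 2, null);
    if ($cmd === 'abort' && $jobId !== null) {
        // e.g. look up the worker pid recorded for $jobId and posix_kill() it
    }
}
```

A front end (or any other process) could then send a command with something like `file_put_contents('/tmp/job-master.fifo', "abort 42\n");`.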

As for detecting hung workers, you could implement a "heartbeat" system, where the child sends a SIGUSR1 to the master process every few seconds. If you then haven't heard from the child for two or three of those intervals, it may be hung. The catch is that since PHP isn't multi-threaded, you can't tell whether a child is hung or whether it's just waiting on a blocking resource (like a database call)... As for implementing the "heartbeat", you could use a tick function to automate it (but keep in mind that ticks won't fire while a call is blocking)...
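A rough sketch of that heartbeat idea, assuming the pcntl and posix extensions; the tick interval, the 30-second timeout, and `do_long_running_job()` are made up for illustration. As the answer notes, ticks stop firing while the child sits inside a blocking call, so the heartbeat only proves the child is still executing PHP statements.

```php
<?php
// Heartbeat sketch: the child pings the master via SIGUSR1 from a tick function.
declare(ticks=1000);

$lastBeat = time();
pcntl_signal(SIGUSR1, function () use (&$lastBeat): void {
    $lastBeat = time(); // master: remember when we last heard from the child
});

$pid = pcntl_fork();
if ($pid === 0) {
    // Child: every 1000 ticks, signal the master (our parent process).
    register_tick_function(function (): void {
        posix_kill(posix_getppid(), SIGUSR1);
    });
    do_long_running_job(); // hypothetical worker function
    exit(0);
}

// Master: poll for heartbeats while the child runs.
while (pcntl_waitpid($pid, $status, WNOHANG) === 0) {
    pcntl_signal_dispatch();
    if (time() - $lastBeat > 30) {
        // No heartbeat for 30 s: the child may be hung (or just blocked on I/O).
    }
    sleep(1);
}
```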

三五鸿雁 2024-09-21 20:54:03

If you run asynchronous tasks with many jobs using pcntl_fork, or you poll with a persistent query every few seconds, be careful about high CPU consumption; processing can hang when memory can no longer be allocated. I think the best choice is to build it fully on Gearman, or you can try a cloud worker service such as IronWorker.
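For reference, a Gearman worker can be quite small. This sketch registers for the hypothetical `run_job` task used in the client example earlier and assumes the PECL gearman extension and a gearmand server on localhost.

```php
<?php
// Minimal Gearman worker sketch (PECL "gearman" extension).
$worker = new GearmanWorker();
$worker->addServer('127.0.0.1', 4730);

$worker->addFunction('run_job', function (GearmanJob $job): string {
    $payload = json_decode($job->workload(), true);
    // ... do the long-running, I/O-bound work here ...
    return 'done';
});

while ($worker->work()) {
    // Loop forever, processing one queued job at a time in this worker process.
}
```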
