Guide to using a worker server with a PHP application
I've built a PHP app, and I've read that it's a best practice to use a 'worker' + queue server when calling APIs or performing time-consuming operations.
A quick search for a tutorial has turned up nothing. I built my app with CodeIgniter, and I make various calls to the Facebook API and use PHP-based image manipulation throughout the app. What I don't understand is how a queue server + worker could help me when I'm making API calls or resizing an image, since the user normally doesn't want a response back from my server until the operation has completed.
What situations are good candidates for a worker + queue server, and are there any guides for adding these to my application? I recently added memcache to my app, and that was trivially easy: I simply wrapped my SQL queries with a memcache handler.
Comments (2)
In the example that you described (image resizing) you basically keep an Apache connection open for the duration of the time it takes to resize your image. Apache processes are expensive and in order to make your system as scalable as possible you should aim to keep your web requests/responses as short as possible.
The other idea is that with a queue you can control concurrency. What if 100+ users upload an image to resize at the same time? Can your server handle it? If you had a worker (backend) server to handle these requests, you would be able to allow only X jobs to execute concurrently.
The same applies to web service requests: instead of holding a connection open, you offload the web service call to a worker process. This frees up an Apache process, and you can implement an AJAX polling mechanism that checks whether the request the backend server issued to the web service has completed. In the long run the system will scale better, and users usually don't like waiting for an operation to complete with no feedback on where it's at. Queuing allows you to execute a task asynchronously and give your visitor feedback on the completion status of that task.
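To make the enqueue-then-poll flow concrete, here is a minimal sketch. It uses an in-memory array as a stand-in for a real queue server, and every name in it (JobStore, enqueue, work, statusJson) is hypothetical. The per-pass limit only imitates the effect of running a fixed number of worker processes; a real setup would keep the jobs in shared storage or in a queue server such as Gearman.

```php
<?php
// In-memory stand-in for a real queue server; all names here are hypothetical.
final class JobStore
{
    private array $jobs = [];
    private int $nextId = 1;

    // Called from the web request: record the job and return immediately.
    public function enqueue(string $task, array $payload): int
    {
        $id = $this->nextId++;
        $this->jobs[$id] = ['task' => $task, 'payload' => $payload, 'status' => 'queued'];
        return $id;
    }

    // Called by the worker: run at most $limit queued jobs in this pass.
    public function work(callable $handler, int $limit): int
    {
        $done = 0;
        foreach ($this->jobs as $id => $job) {
            if ($done >= $limit) {
                break;
            }
            if ($job['status'] !== 'queued') {
                continue;
            }
            $this->jobs[$id]['status'] = 'processing';
            $handler($job['task'], $job['payload']);
            $this->jobs[$id]['status'] = 'done';
            $done++;
        }
        return $done;
    }

    // Called by the AJAX polling endpoint.
    public function statusJson(int $id): string
    {
        $status = $this->jobs[$id]['status'] ?? 'unknown';
        return json_encode(['id' => $id, 'status' => $status]);
    }
}

$store  = new JobStore();
$first  = $store->enqueue('resize', ['file' => 'a.jpg']);
$second = $store->enqueue('resize', ['file' => 'b.jpg']);

// One worker pass limited to a single job; the handler body is a placeholder.
$store->work(function (string $task, array $payload): void {
    // slow work (image resize, remote API call) would happen here
}, 1);

echo $store->statusJson($first), "\n";  // {"id":1,"status":"done"}
echo $store->statusJson($second), "\n"; // {"id":2,"status":"queued"}
```

The web request only pays for enqueue(); the browser then polls the status endpoint until the job reports done.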
I typically work with Zend Server's Job Queue (http://devzone.zend.com/article/11907), which is available with the full (commercial) edition of Zend Server. However, Gearman is also excellent at this and has a PHP extension: http://php.net/manual/en/book.gearman.php and an example: http://www.php.net/manual/en/gearmanclient.do.php.
Hope this helps.
--EDIT--
@Casey, I started out adding a comment, but realized it was quickly going to become too long, so I edited the answer instead. I just read the docs for cloudControl, which is a service I did not know. Luckily, I have used CodeIgniter quite extensively, so I'll try to hack together an answer for you:
1- Cloudcontrol's concept of a worker is to launch a php script from the command line. Therefore you need a way for Codeigniter to accept firing a script from the command line and making it dispatch to a controller. You will probably want to limit that to one controller. See the code at: http://pastebin.com/GZigWbT3
This file does in essence what CI's index.php file does, except it emulates a request by setting $_REQUEST['SERVER_URI']. Be sure to place that file outside of your document root, and adjust the $system_folder variable accordingly.
2- You need a controller script.php in your controllers folder, from which you will disable web requests. You can do something to the effect of:
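A minimal sketch of such a guard, written as plain PHP so it can run on its own. The function name assertCommandLine is hypothetical; in CodeIgniter the check would most likely sit in the constructor of the script controller. CodeIgniter 2.x and later also provide $this->input->is_cli_request(), which could replace the manual SAPI check.

```php
<?php
// Hypothetical guard: refuse to run unless PHP was started from the command line.
// The optional $sapi argument exists only so the check can be exercised in isolation.
function assertCommandLine(?string $sapi = null): void
{
    if (($sapi ?? PHP_SAPI) !== 'cli') {
        throw new RuntimeException('This controller can only be run from the command line.');
    }
}

// Simulate a web request (Apache module SAPI) hitting the controller.
try {
    assertCommandLine('apache2handler');
    $blocked = false;
} catch (RuntimeException $e) {
    $blocked = true; // the web request is rejected before any job code runs
}
```

Calling assertCommandLine() with no argument checks the SAPI of the running process, which is what the controller would do in practice.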
3- The last piece is for you to develop a wrapper library in CI (in your system/application/libraries folder) which would effectively wrap the functionality of CloudController's worker invocation
4- Now from anywhere in your code where you actually want to queue a job, if from a controller:
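As a rough sketch of steps 3 and 4 together: the wrapper below only builds the command line for the CLI entry script and hands it to a launcher callable. Everything here is an assumption made for illustration: the class name WorkerQueue, the worker.php entry script, and the script/resize_image route are hypothetical, and the launcher stands in for whatever call actually starts a worker on the hosting platform, which is not shown here.

```php
<?php
// Hypothetical wrapper library; it does not call any real hosting API.
final class WorkerQueue
{
    /** @var callable(string): void starts a worker for the given command line */
    private $launcher;
    private string $entryScript;

    public function __construct(callable $launcher, string $entryScript = 'worker.php')
    {
        $this->launcher    = $launcher;
        $this->entryScript = $entryScript;
    }

    // Build "php <entry script> <route> <params...>" with every argument shell-escaped.
    public function buildCommand(string $route, array $params = []): string
    {
        $parts = ['php', escapeshellarg($this->entryScript), escapeshellarg($route)];
        foreach ($params as $param) {
            $parts[] = escapeshellarg((string) $param);
        }
        return implode(' ', $parts);
    }

    // Queue a job by passing the command to the launcher; returns the command for logging.
    public function queue(string $route, array $params = []): string
    {
        $command = $this->buildCommand($route, $params);
        ($this->launcher)($command);
        return $command;
    }
}

// Usage, for example from a controller method. The launcher here only records
// the command; a real one would start the worker process.
$launched = [];
$queue = new WorkerQueue(function (string $command) use (&$launched): void {
    $launched[] = $command;
});
$queue->queue('script/resize_image', [42]);
```

Keeping the launcher as a constructor argument means the controller code stays the same if the way workers are started changes.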
Please note that:
1- This could be polished further; it is really meant to give you an idea of how it could be done.
2- Since I have no cloudcontroller account, I have no way of testing the code, so it might need tweaking. The utilities.php script is one I use in my own projects, so that one should be good.
Good luck!
If you don't require a dedicated worker/queue server setup, you can write a small library for your CodeIgniter installation to manage a simple work queue.
During the initial client request you check whether the generated image, or the remote file in the cache, needs to be (re)generated; if not, you serve the file. If the file or image needs to be built, you tell the queue library to add it to the queue, and then close the connection to the browser. However, you still process the queue at the end of your controller, during that same request. This way you don't need a separate queue and worker server.
For me, the comments on http://www.php.net/manual/en/features.connection-handling.php were very helpful. You basically do something like the following (proof of concept, see the link for details):
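A sketch of that pattern, under a few assumptions: no other output buffer or output compression is active, and the function name respondThenProcess and the 30-second limit are illustrative rather than taken from the linked page. Under PHP-FPM, fastcgi_finish_request() does the closing more reliably, so it is called when available.

```php
<?php
// Send the response, close the connection, then keep working in the same request.
// Assumes no other output buffer or output compression is active.
function respondThenProcess(string $body, callable $processQueue): void
{
    ignore_user_abort(true); // keep running after the browser disconnects
    set_time_limit(30);      // illustrative budget for the queue work

    ob_start();
    echo $body;
    if (!headers_sent()) {
        header('Connection: close');
        header('Content-Length: ' . ob_get_length());
    }
    ob_end_flush(); // hand the buffered body to PHP's output layer
    flush();        // ask the web server to send it now

    if (function_exists('fastcgi_finish_request')) {
        fastcgi_finish_request(); // under PHP-FPM, finish the response explicitly
    }

    $processQueue(); // the slow part runs after the response has been sent
}

// Hypothetical usage at the end of a controller action.
$processed = [];
respondThenProcess("Your image is being generated.\n", function () use (&$processed): void {
    $processed[] = 'thumb_42.jpg'; // stand-in for real queue processing
});
```

The Content-Length header is what lets the browser treat the response as complete while the script keeps running.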
During development and debugging, you can temporarily disable the close-connection part and see any output. For production you could use log_message().
Queue-library functionality (notes to coder/self): When adding a file to the queue, the queue library should check whether the file is already in the queue. Because in this setup the workers run asynchronously (many different browser connections), a worker that starts processing a job should set the job status to something like 'processing', so that no other worker starts on the same job. Alternatively, you could set up a sequential queue by setting the overall queue status to 'queue-is-processing' (one worker at a time). Timeouts for jobs (or for the overall queue) are probably a good idea too, and the timeout should be a bit longer than set_time_limit(). This way you can tell when a job has probably failed and update an error log. Run queue cleanups early in the request, to make sure they are processed and don't fall outside any timeout.
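These notes could translate into something like the sketch below. It keeps jobs in an in-memory array purely for illustration; since the workers are separate requests, a real library would keep this state in shared storage (a database table, for instance) and make the claim step atomic. The class and method names are hypothetical.

```php
<?php
// Hypothetical queue library; the in-memory array stands in for shared storage.
final class SimpleQueue
{
    /** @var array<string, array{status: string, startedAt: ?int}> keyed by file name */
    private array $jobs = [];

    // Add a file unless it is already queued or being processed.
    public function add(string $file): bool
    {
        if (isset($this->jobs[$file])) {
            return false;
        }
        $this->jobs[$file] = ['status' => 'queued', 'startedAt' => null];
        return true;
    }

    // Claim the next queued job by marking it 'processing' so no other worker takes it.
    public function claim(int $now): ?string
    {
        foreach ($this->jobs as $file => $job) {
            if ($job['status'] === 'queued') {
                $this->jobs[$file] = ['status' => 'processing', 'startedAt' => $now];
                return $file;
            }
        }
        return null;
    }

    // Remove a finished job from the queue.
    public function complete(string $file): void
    {
        unset($this->jobs[$file]);
    }

    // Mark jobs that have been 'processing' for longer than $timeout seconds as failed.
    public function failStale(int $now, int $timeout): array
    {
        $failed = [];
        foreach ($this->jobs as $file => $job) {
            if ($job['status'] === 'processing' && $now - $job['startedAt'] > $timeout) {
                $this->jobs[$file]['status'] = 'failed';
                $failed[] = $file; // candidates for the error log
            }
        }
        return $failed;
    }
}

$queue     = new SimpleQueue();
$added     = $queue->add('a.jpg');         // true
$duplicate = $queue->add('a.jpg');         // false: already in the queue
$claimed   = $queue->claim(1000);          // 'a.jpg'
$second    = $queue->claim(1001);          // null: nothing left to claim
// Timeout of 35 s, a bit longer than an assumed set_time_limit(30).
$stale     = $queue->failStale(1040, 35);  // ['a.jpg']
```

Passing the current time in as an argument keeps the timeout logic easy to exercise without waiting.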
Note: From that same linked page, if you act on files in the local filesystem, and at the same time want to use ignore_user_abort(true) or register_shutdown_function(), it seems wise to store the working directory first. $cwd = getcwd();
edit:
found a good starting point for a job-library:
http://www.andy-russell.com/job-scheduler-library