How to set up Beanstalkd with PHP

Posted on 2024-12-09 07:21:19

Recently I've been researching the use of Beanstalkd with PHP. I've learned quite a bit but have a few questions about the setup on a server, etc.

Here is how I see it working:

  1. I install Beanstalkd and any dependencies (such as libevent) on my Ubuntu server. I then start the Beanstalkd daemon (which should basically run at all times).
  2. Somewhere in my website (such as when a user performs some actions, etc) tasks get added to various tubes within the Beanstalkd queue.
  3. I have a shell script (such as the following one) that is run as a daemon and basically executes a PHP script.

    #!/bin/sh
    php worker.php
    

  4. The worker script would have something like this to execute the queued-up tasks:

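    while (1) {
      // Block until a job is available in the 'test' tube.
      $job = $this->pheanstalk->watch('test')->ignore('default')->reserve();
      // The payload was queued as JSON; decode it into an object.
      $job_encoded = json_decode($job->getData(), false);
      $done_jobs[] = $job_encoded;
      $this->log('job:'.print_r($job_encoded, 1));
      // Delete the finished job so it is not handed out again.
      $this->pheanstalk->delete($job);
    }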

Now here are my questions based on the above setup (correct me if I've got any of it wrong):

  1. Say I have the task of importing an RSS feed into a database or something. If 10 users do this at once, they'll all be queued up in the "test" tube. However, they'd then only be executed one at a time. Would it be better to have 10 different tubes all executing at the same time?

  2. If I do need more tubes, does that also mean I'd need 10 worker scripts, one for each tube, all running concurrently with basically the same code except for the string literal in the watch() call?

  3. If I run that script as a daemon, how does that work? Will it constantly be executing the worker.php script? That script loops until the queue is empty theoretically, so shouldn't it only be kicked off once? How does the daemon decide how often to execute worker.php? Is that just a setting?

Thanks!

Comments (1)

紧拥背影 2024-12-16 07:21:19
  1. If the worker isn't taking too long to fetch the feed, it will be fine. You can run multiple workers if you need to process more than one job at a time. I've got a system (currently using Amazon SQS, but I've done similar with Beanstalkd before) with up to 200 (or more) workers pulling from the queue.
  2. A single worker script (the same script running multiple times) should be fine. The script can watch multiple tubes at the same time, and the first available job from any of them will be reserved. You can also use the stats-job command to see which tube a particular $job came from, or put some meta-information into the message if you need to tell one type from another.
  3. A good example of running a worker is described here. I've also added supervisord (also, a useful post to get started) to easily start, and keep running, a number of workers per machine (I run shell scripts, as in the first link). I would limit the number of times the worker loops, and also pass a timeout to reserve() so that it waits a few seconds, or more, for the next job to become available, rather than spinning in a tight loop that never pauses even when there is nothing to do.
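
The shell-script approach from point 3 could look roughly like the sketch below: a wrapper that starts the worker again each time it exits. This is not the script from the linked post, which is not reproduced here. The function name `run_worker_loop` and its `MAX_RUNS` and `PAUSE_SECONDS` arguments are made up for illustration.

```shell
#!/bin/sh
# Hypothetical wrapper (names are illustrative, not from the linked script).
# Usage: run_worker_loop MAX_RUNS PAUSE_SECONDS COMMAND [ARGS...]
# MAX_RUNS=0 means "restart forever".
run_worker_loop() {
    max_runs=$1
    pause=$2
    shift 2
    runs=0
    while [ "$max_runs" -eq 0 ] || [ "$runs" -lt "$max_runs" ]; do
        "$@"                      # run the worker; returns when it exits
        status=$?
        runs=$((runs + 1))
        echo "worker exited with status $status (run $runs)" >&2
        sleep "$pause"            # short pause so a crashing worker cannot spin
    done
}
```

Adding `run_worker_loop 0 1 php worker.php` as the last line of the script would keep one worker alive. Starting several copies of the script, or having supervisord start them, gives several workers pulling from the same tube concurrently, so no extra tubes are needed just for parallelism.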

Addendum:

  1. The shell script would be run as many times as you need. (The link shows how to have it re-run as required with exec $@.) Whenever the PHP script exits, it re-runs PHP.
  2. Apparently there's a Django app to show some stats, but it's trivial enough to connect to the daemon, get a list of tubes, and then get the stats for each tube, or just the counts.
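
The "connect to the daemon, list the tubes, get the stats" idea from the second addendum item can be sketched with the plain-text beanstalkd protocol. The sketch assumes `nc` (netcat) is installed and that beanstalkd listens on its default port 11300. The helper names (`bs_cmd`, `parse_tubes`, `parse_stat`, `print_ready_counts`) and the `BS_HOST`/`BS_PORT` variables are hypothetical.

```shell
#!/bin/sh
# Assumed connection settings; override via the environment if needed.
BS_HOST="${BS_HOST:-127.0.0.1}"
BS_PORT="${BS_PORT:-11300}"

# Send one protocol command, then "quit" so the server closes the connection.
bs_cmd() {
    printf '%s\r\nquit\r\n' "$1" | nc "$BS_HOST" "$BS_PORT"
}

# Extract tube names from a list-tubes response (a YAML list) read on stdin.
parse_tubes() {
    tr -d '\r' | sed -n 's/^- //p'
}

# Extract one field from a stats-tube response (a YAML map) read on stdin.
parse_stat() {
    tr -d '\r' | sed -n "s/^$1: //p"
}

# Print "<tube> <ready job count>" for every tube the daemon knows about.
print_ready_counts() {
    bs_cmd 'list-tubes' | parse_tubes | while read -r tube; do
        ready=$(bs_cmd "stats-tube $tube" | parse_stat 'current-jobs-ready')
        echo "$tube $ready"
    done
}
```

Calling `print_ready_counts` against a running daemon prints one line per tube. The same `parse_stat` helper can pull other stats-tube fields, such as `current-jobs-reserved`, if more than the ready count is wanted.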