Job queue - can cron work?

Posted 2024-12-09 17:27:30

I'm building a small application that requires people to upload images by email. It is built in PHP (no framework) with MySQL and S3.

So far, in my scenario: emails are stored on a POP3 account. A script runs every minute, fetches the oldest email, resizes the image, uploads it to S3, stores its path in the DB, and deletes the email.
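For reference, a minimal sketch of that per-minute job, assuming PHP's imap and GD extensions, PDO, and the AWS SDK for PHP; the hostnames, credentials, bucket name, and the extract_first_image() helper are hypothetical placeholders:

```php
<?php
// Minimal sketch of the per-minute job (hypothetical names throughout).
require 'vendor/autoload.php';

use Aws\S3\S3Client;

$mbox = imap_open('{pop.example.com:995/pop3/ssl}INBOX', 'user@example.com', 'secret');
if ($mbox === false) {
    exit("Could not open mailbox\n");
}

if (imap_num_msg($mbox) > 0) {
    $msgno = 1; // the oldest message in the POP3 mailbox

    // extract_first_image() would walk the MIME parts and return the
    // decoded bytes of the first image attachment (not shown here).
    $imageBytes = extract_first_image($mbox, $msgno);

    // Resize with GD and write a temporary JPEG.
    $src     = imagecreatefromstring($imageBytes);
    $resized = imagescale($src, 800); // 800 px wide, height kept proportional
    $tmpFile = tempnam(sys_get_temp_dir(), 'img');
    imagejpeg($resized, $tmpFile, 85);

    // Upload to S3.
    $s3  = new S3Client(['version' => 'latest', 'region' => 'us-east-1']);
    $key = 'uploads/' . uniqid('', true) . '.jpg';
    $s3->putObject(['Bucket' => 'my-bucket', 'Key' => $key, 'SourceFile' => $tmpFile]);

    // Store the S3 key in MySQL.
    $pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');
    $pdo->prepare('INSERT INTO images (s3_key, created_at) VALUES (?, NOW())')
        ->execute([$key]);

    // Delete the email and clean up.
    imap_delete($mbox, $msgno);
    imap_expunge($mbox);
    unlink($tmpFile);
}

imap_close($mbox);
```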

At a larger scale, how would this be managed? Is a cron job the best way to handle this type of situation? What if the process takes more than a minute: it will overlap and eventually fail, right? Or what if it takes less than a minute? I'd get unwanted idle time, considering I would have more than 60 requests an hour at a bigger scale...

Perhaps I should use a .forward file to process emails, but again I would not control the flow.
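For what it's worth, a .forward delivery typically pipes each message to a command on STDIN, so one way to keep some control over the flow would be to have that command only enqueue the raw message and let a separate worker do the heavy lifting. A minimal sketch, with a hypothetical incoming_emails table and made-up credentials:

```php
#!/usr/bin/env php
<?php
// Hypothetical script referenced from ~/.forward, e.g. the single line:
//   "|/usr/bin/php /var/www/app/enqueue_email.php"
// The MTA runs it once per delivery with the raw message on STDIN.

$rawMessage = stream_get_contents(STDIN);

// Only enqueue; a separate worker resizes/uploads later, so a slow
// S3 upload can never block or drop mail delivery.
$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');
$pdo->prepare('INSERT INTO incoming_emails (raw_message, received_at) VALUES (?, NOW())')
    ->execute([$rawMessage]);
```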

I believe most of these scenarios work, I'm just curious regarding best practices.

Thanks!

Comments (2)

断桥再见 2024-12-16 17:27:31

A slightly modified approach could be (a minimal sketch follows the list):

  • Run your script from cron every minute
  • The script checks whether another instance of it is already running and, if so, simply exits
  • The running script processes the remaining queue until it is empty or until it has handled a maximum number of elements (e.g. max 10)
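A minimal sketch of that check, assuming a hypothetical process_next_email() function that handles one message and returns false once the queue is empty:

```php
<?php
// Cron runs this every minute; flock() makes sure only one instance works.
$lock = fopen('/tmp/email-worker.lock', 'c');
if ($lock === false || !flock($lock, LOCK_EX | LOCK_NB)) {
    exit(0); // another instance is already running: just exit
}

$maxItems = 10; // upper bound per run, as suggested above
for ($i = 0; $i < $maxItems; $i++) {
    if (!process_next_email()) {
        break; // queue is empty
    }
}

flock($lock, LOCK_UN); // the lock is also released automatically on exit
fclose($lock);
```

flock() is only one way to do the check; a PID file or a database lock works just as well. The point is simply that a second instance bails out immediately.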

I once had a backup procedure that backed up customer directories every hour if the last completed backup was older than a certain amount of time. This worked great until some customers had too many GB of data and the script started taking more than an hour to do the backup.

Without the check, the next hour the script would run the same customer again, which would also take more than an hour, and so on, until the machine became unresponsive under a very high load.

The fix I implemented was the check described above: if another instance is running, just exit and wait for the next cycle. After that fix I didn't have a problem for years.

网白 2024-12-16 17:27:31

Try having a long-running process. It checks for mail, and processes all of it. If it can't find any more mail when it's done, then it goes to sleep for a minute.
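A minimal sketch of such a worker, reusing the hypothetical process_next_email() helper from the previous answer:

```php
<?php
// Long-running worker: drain the mailbox, then sleep a minute and poll again.
// A supervisor (daemontools' supervise, systemd, ...) restarts it if it dies.
while (true) {
    $processed = 0;
    while (process_next_email()) {
        $processed++;
    }
    if ($processed === 0) {
        sleep(60); // nothing to do: wait a minute before checking again
    }
}
```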

If you have issues with stability, you can always use something like supervise.
