Running a PHP app on Windows - daemon or cron?
I need some implementation advice. I have a MySQL database that will be written to remotely with tasks to process locally, and I need my application, which is written in PHP, to execute these tasks immediately as they come in.
But of course my PHP app needs to be told when to run. I thought about using cron jobs, but my app is on a Windows machine. Secondly, I need to check constantly, every few seconds, and cron can only run once a minute.
I thought of writing a PHP daemon, but I am getting confused about how it is going to work and whether it is even a good idea!
I would appreciate any advice on the best way to do this.
Comments (8)
pyCron is a good CRON alternative for Windows.
Since this task is quite simple, I would just set up pyCron to run a script every minute.
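A minimal sketch of what such a script could look like, assuming the scheduler starts it once a minute and it keeps polling until just before the next start. The function name `pollForJobs` and the three callables are hypothetical stand-ins for your own MySQL code, and the demo at the bottom uses an in-memory array so it runs without a database:

```php
<?php
/**
 * Polls for pending jobs until the time budget runs out.
 * $fetchPending, $execute and $markCompleted are hypothetical callables
 * that would wrap the real database queries.
 * Returns the number of jobs that were executed successfully.
 */
function pollForJobs(
    callable $fetchPending,
    callable $execute,
    callable $markCompleted,
    float $budgetSeconds = 55.0,
    float $pauseSeconds = 2.0
): int {
    $deadline = microtime(true) + $budgetSeconds;
    $processed = 0;

    do {
        foreach ($fetchPending() as $job) {
            if ($execute($job) === true) {
                $markCompleted($job);
                $processed++;
            }
        }
        // Pause between polls so the loop does not eat CPU cycles.
        usleep((int) ($pauseSeconds * 1000000));
    } while (microtime(true) < $deadline);

    return $processed;
}

// Demo: an in-memory queue instead of MySQL, with a very short budget.
$queue = ['resize-image', 'send-mail'];
$done = [];

$count = pollForJobs(
    function () use (&$queue) { $batch = $queue; $queue = []; return $batch; },
    function (string $job) { return true; },
    function (string $job) use (&$done) { $done[] = $job; },
    0.2,
    0.05
);

echo "Processed $count job(s)\n";
```

In real use you would probably keep the defaults (a 55 second budget with a 2 second pause), so that one run ends shortly before the scheduler starts the next.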
This way, if the computer goes down, you'll have a worst case delay of 60 seconds.
You might also want to look into semaphores or some kind of locking strategy, such as an APC variable or a check for the existence of a lock file, to avoid race conditions.
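A rough sketch of the lock-file variant (not the APC one, so that it needs no extension). The helper names and the lock path are made up for illustration. It relies on `fopen()` mode `'x'`, which creates the file and fails if it already exists, so the existence check and the creation happen in one step:

```php
<?php
// Tries to take the lock; returns false if another run already holds it.
function acquireLock(string $path): bool
{
    // Mode 'x' fails when the file exists; @ hides the resulting warning.
    $handle = @fopen($path, 'x');
    if ($handle === false) {
        return false;
    }
    fwrite($handle, (string) getmypid()); // record the owner for debugging
    fclose($handle);
    return true;
}

// Removes the lock file so the next run can start.
function releaseLock(string $path): void
{
    if (is_file($path)) {
        unlink($path);
    }
}

// Hypothetical lock location; any writable path works.
$lockFile = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'task-runner.lock';

if (acquireLock($lockFile)) {
    try {
        // ... process pending jobs here ...
        echo "Lock taken, doing the work.\n";
    } finally {
        releaseLock($lockFile);
    }
} else {
    echo "Another run is still active, exiting.\n";
}
```

One limitation of this approach: if the script crashes hard, the file stays behind and has to be removed by hand (or by some expiry logic that this sketch leaves out).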
If you're sticking with the PHP daemon, do yourself a favor and drop that idea; use Gearman instead.
EDIT: I asked a related question once that might interest you: Anatomy of a Distributed System in PHP.
I'll suggest something out of the ordinary: you said you need to run the task at the point the data is written to MySQL. That implies MySQL "knows" something should be executed.
It sounds like a perfect scenario for the sys_exec UDF for MySQL.
Basically, it would be nice if MySQL could invoke an external program once something happened to it.
If you use the mentioned UDF, you can execute a PHP script from within, say, an INSERT or UPDATE trigger.
On the other hand, you can make it more resource-friendly and create a MySQL Event (assuming your version supports events) that uses sys_exec to invoke a PHP script that does certain updates at predefined intervals. That reduces the need for cron or any similar program that executes something at predefined intervals.
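To make both variants concrete, here is a small PHP snippet that just prints the two statements you would run once against the database. Everything named here is an assumption: the `tasks` table, the trigger and event names, and the script path are hypothetical, and `sys_exec` is not part of MySQL itself but comes from the third-party lib_mysqludf_sys library, which has to be installed on the server first:

```php
<?php
// Hypothetical command; forward slashes avoid backslash escaping in SQL.
$command = 'php C:/jobs/process_tasks.php';

// Variant 1: run the script whenever a row is inserted into `tasks`.
$trigger = <<<SQL
CREATE TRIGGER tasks_after_insert
AFTER INSERT ON tasks
FOR EACH ROW
SET @exit_code = sys_exec('$command');
SQL;

// Variant 2: run the script every 5 seconds via the MySQL event scheduler.
$event = <<<SQL
CREATE EVENT process_tasks_every_5s
ON SCHEDULE EVERY 5 SECOND
DO SET @exit_code = sys_exec('$command');
SQL;

echo $trigger, "\n\n", $event, "\n";
```

Two things to keep in mind: the event only fires if the event scheduler is switched on, and as far as I know `sys_exec` waits for the command to finish, so with the trigger variant a slow script would hold up every INSERT.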
I would definitely not advise using cronjobs for this.
Cronjobs are a good thing, very useful and easy for many purposes, but given the needs you describe, I think they can produce more complications than they are worth. Here are some things to consider:
What happens if jobs overlap, for example when one takes longer than a minute to execute? Are there any shared resources, deadlocks, or temp files? The most common method is to use a lock file and stop execution right at the start of the program if the lock is already taken. But then the program also has to look for further jobs right before it completes. This can also get complicated on Windows machines, because AFAIK they don't support write locks out of the box.
Cronjobs are a pain in the ass to maintain. If you want to monitor them, you have to implement additional logic, such as a check for when the program last ran. This can get difficult if your program should run only on demand. The best way would be some sort of "job completed" field in the database, or deleting rows that have been processed.
On most Unix-based systems cronjobs are pretty stable now, but there are a lot of situations where you can break your cronjob system. Most of them come down to human error. For example, a sysadmin not exiting the crontab editor properly in edit mode can cause all cronjobs to be deleted. A lot of companies also have no proper monitoring system, for the reasons stated above, and only notice once their services experience problems. At that point, often nobody has written down or put under version control which cronjobs should run, and wild guessing and reconstruction work begins.
Cronjob maintenance can be further complicated when external tools are used and the environment is not a native Unix system. Sysadmins have to learn more programs, and each of those can have its own errors.
I honestly think a small script that you start from the console and leave open is just fine.
You can also touch a file (update its modification timestamp) in every loop, and write a Nagios script that checks whether that timestamp has gone out of date, so you know that your job is still running...
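A minimal sketch of that heartbeat idea. The function names, the file path and the 60 second threshold are all made up for illustration. The return values follow the usual Nagios plugin convention of 0 for OK and 2 for CRITICAL:

```php
<?php
// Called by the long-running script once per loop iteration.
function beat(string $heartbeatFile): void
{
    touch($heartbeatFile); // creates the file or updates its mtime
}

// Called by the monitoring check; returns a Nagios-style status code.
function checkHeartbeat(string $heartbeatFile, int $maxAgeSeconds): int
{
    clearstatcache(true, $heartbeatFile); // do not trust a cached mtime
    $mtime = @filemtime($heartbeatFile);

    if ($mtime === false) {
        echo "CRITICAL - heartbeat file is missing\n";
        return 2;
    }

    $age = time() - $mtime;
    if ($age > $maxAgeSeconds) {
        echo "CRITICAL - last heartbeat was {$age}s ago\n";
        return 2;
    }

    echo "OK - last heartbeat was {$age}s ago\n";
    return 0;
}

// Demo: one beat followed by one check.
$file = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'task-runner.heartbeat';
beat($file);
$status = checkHeartbeat($file, 60);
```

In an actual check script you would end with `exit(checkHeartbeat($file, 60));` so that the monitoring system sees the status as the process exit code.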
If you want it to start up with the system, I recommend a daemon.
PS: In the company I work for there is a lot of background activity for our website (crawling, update processes, calculations, etc.), and the cronjobs were a real mess when I started there. They were spread over different servers responsible for different tasks. Databases were accessed wildly across the internet. A ton of NFS filesystems, Samba shares, etc. were in place to share resources. The place was full of single points of failure and bottlenecks, and something was constantly breaking. There were so many technologies involved that it was very difficult to maintain, and when something didn't work it took hours to track down the problem and another hour to work out what that part was even supposed to do.
Now we have one unified update program that is responsible for literally everything. It runs on several servers, and each has a config file that defines the jobs to run. Everything gets dispatched from one parent process running an infinite loop. It is easy to monitor, customize, and synchronize, and everything runs smoothly. It is redundant, it is synchronized, and the granularity is fine. So it runs in parallel, and we can scale up to as many servers as we like.
I really suggest sitting down for long enough to think about everything as a whole and get a picture of the complete system. Then invest the time and effort to implement a solution that will serve you well in the future and doesn't spread tons of different programs throughout your system.
PPS:
I read a lot about the minimum interval of 1 or 5 minutes for cronjobs/tasks. You can easily work around that with an arbitrary script that takes over the interval itself, for example by looping and sleeping between runs.
This looks like a job for a job server ;) Have a look at Gearman. The additional benefit of this approach is that the work is triggered by the remote side, when and only when there is something to do, instead of by polling. Especially at intervals shorter than (let's say) 5 minutes, polling is not very efficient any more, depending on the tasks the job performs.
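A rough sketch of how that could look with the PECL gearman extension. It assumes the extension is installed and a gearmand server is reachable on 127.0.0.1:4730; the function name `process_task` and the JSON payload are made up for illustration. The remote side calls `submitTask()` when it has work, and the local worker picks it up without polling the database:

```php
<?php
// The actual work, kept separate so it can be run without Gearman.
function processTask(string $workload): string
{
    $task = json_decode($workload, true);
    return 'processed task ' . $task['id'];
}

// Local side: register the function and wait for jobs.
function runWorker(): void
{
    $worker = new GearmanWorker();
    $worker->addServer('127.0.0.1', 4730);
    $worker->addFunction('process_task', function (GearmanJob $job) {
        return processTask($job->workload());
    });

    while ($worker->work()) {
        // work() blocks until the server hands over the next job.
    }
}

// Remote side: queue a job and return without waiting for the result.
function submitTask(int $taskId): void
{
    $client = new GearmanClient();
    $client->addServer('127.0.0.1', 4730);
    $client->doBackground('process_task', json_encode(['id' => $taskId]));
}

// Start the worker only when asked to: php this_script.php worker
if (PHP_SAPI === 'cli' && ($argv[1] ?? '') === 'worker') {
    runWorker();
}
```

Whether the remote writer can reach the Gearman server directly depends on your network setup, so treat the addresses above as placeholders.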
The quick and dirty way is to create a loop that continuously checks whether there is new work.
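Roughly, such a loop could look like the following PHP sketch. The function name `runWorkerLoop` and its callables are hypothetical placeholders for your own code. It handles tasks back to back while there are any and only sleeps when nothing is waiting:

```php
<?php
/**
 * Runs until $keepRunning returns false.
 * $claimNextTask returns the next task or null when none is waiting.
 * Returns the number of tasks handled.
 */
function runWorkerLoop(
    callable $claimNextTask,
    callable $handle,
    callable $keepRunning,
    int $idleSleepSeconds = 5
): int {
    $handled = 0;

    while ($keepRunning()) {
        $task = $claimNextTask();

        if ($task === null) {
            // Nothing to do: wait a bit instead of spinning.
            sleep($idleSleepSeconds);
            continue;
        }

        $handle($task);
        $handled++;
    }

    return $handled;
}

// Demo with an in-memory list; a real worker would query MySQL instead
// and pass a $keepRunning callable that simply returns true.
$tasks = ['a', 'b', 'c'];
$handled = runWorkerLoop(
    function () use (&$tasks) { return array_shift($tasks); },
    function (string $task) { echo "handling $task\n"; },
    function () use (&$tasks) { return count($tasks) > 0; }
);
```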
Have you tried the Windows scheduler (it comes with Windows by default)? You need to give it the path to PHP and the path to your PHP file. It works well.
Can't you just write a Java/C++ program that queries the database for you at a set interval? You can add it to the list of startup programs so that it is always running. Once a task is found, it can even handle it on a separate thread, keep processing further requests, and mark finished ones as complete.
The simplest way is to use the built-in Windows scheduler.
Run your script with php-cli.exe, using a php.ini that loads the extensions your script needs.
But I should say that in practice you don't need such a short interval for your scheduled jobs. Run some tests to find the best interval for your case. Setting an interval of less than 1 minute is not recommended.
Another piece of advice: create a lock file at the beginning of your script (PHP's flock function) and check whether you can lock it for writing, to prevent two or more copies from running at the same time. At the end of your script, unlock the file and then unlink it.
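A minimal sketch of the flock approach. The helper name `withExclusiveLock` and the lock path are made up for illustration. `LOCK_NB` makes `flock()` return immediately instead of waiting, so a second copy can simply exit:

```php
<?php
/**
 * Runs $work while holding an exclusive lock on $lockPath.
 * Returns false without running $work if another copy holds the lock.
 */
function withExclusiveLock(string $lockPath, callable $work): bool
{
    // Mode 'c' creates the file if needed and never truncates it.
    $handle = fopen($lockPath, 'c');
    if ($handle === false) {
        throw new RuntimeException("Cannot open lock file: $lockPath");
    }

    try {
        if (!flock($handle, LOCK_EX | LOCK_NB)) {
            return false; // another copy is running
        }
        try {
            $work();
        } finally {
            flock($handle, LOCK_UN);
        }
        return true;
    } finally {
        fclose($handle);
    }
}

// Hypothetical lock location; any writable path works.
$lockPath = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'scheduled-job.lock';

$ran = withExclusiveLock($lockPath, function () {
    echo "Running the scheduled job.\n";
});

if (!$ran) {
    echo "Job is already running, skipping this run.\n";
}
```

This sketch leaves the file in place rather than unlinking it, since it is the lock, not the file's existence, that blocks a second copy. The operating system also drops the lock if the script dies, so there is no stale lock to clean up.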
If you have to write the output to the DB, try to use MySQL TRIGGERS instead of PHP. Or use events in MySQL.