Monitoring an unattended batch program
We currently have a batch program that runs 24/7. It tests several pages and emails us only if it finds an error on a page. If no emails arrive, we assume the program is still running.
However, we need a service (or some other way) to tell us when the program has stopped running. The program is installed on a test machine that is on 24/7. We're considering some kind of push monitoring service, i.e. our program pings a third-party system, and if that system does not receive the expected ping, it alerts us. Do you know of such a service? Or can you recommend other options? Thanks!
Comments (3)
The best way to monitor the script is to have it log its status and/or a checkpoint to a file periodically. At each phase or major iteration, your script would either log to a file or submit a message to syslog. Alternatively, if your batch script passes a specific point in the code often enough, you could insert a health check timer there. When the specified timeout has occurred, it puts a message into a log file.
The pseudocode might look like this:
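A minimal sketch of the health check timer idea, written in Python rather than pseudocode. The class name `HealthCheckTimer`, the logger name, and the injectable `clock` parameter are assumptions made for illustration; only the `check_timeout` name comes from this answer.

```python
import logging
import time

logger = logging.getLogger("batch_health")  # hypothetical logger name


class HealthCheckTimer:
    """Writes a heartbeat log line once at least `interval` seconds have elapsed."""

    def __init__(self, interval, clock=time.monotonic):
        self.interval = interval        # seconds between heartbeat messages
        self.clock = clock              # injectable so the timer can be tested
        self.last_report = clock()

    def check_timeout(self):
        """Call from a point the batch loop passes often. Returns True if it logged."""
        now = self.clock()
        if now - self.last_report >= self.interval:
            logger.info("batch program alive, checkpoint at %.0f", now)
            self.last_report = now
            return True
        return False
```

In the batch loop you would create one timer, for example `HealthCheckTimer(interval=300)`, and call `timer.check_timeout()` once per iteration. If the heartbeat lines stop appearing in the log, the program has stalled or died.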
Alternatively, you could change your check_timeout routine to forward a message to a monitoring system such as Zabbix, using zabbix_sender to update an item with the current time. You would then write a trigger that fires if the time since the last update is 1.5 or more times the average check-in interval (the right factor depends on your average load, since check-in times may vary).
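As a rough sketch of that approach, the check-in could shell out to `zabbix_sender` with its `-z` (server), `-s` (host), `-k` (item key), and `-o` (value) options. The server name, host name, and item key in the usage note are hypothetical placeholders. The `is_stale` function only illustrates the 1.5x condition in Python; in practice that condition would live in a Zabbix trigger.

```python
import subprocess
import time


def build_sender_command(server, host, key, value):
    """Build the zabbix_sender argument list for a single item update."""
    return ["zabbix_sender", "-z", server, "-s", host, "-k", key, "-o", str(value)]


def report_checkin(server, host, key, run=subprocess.run):
    """Send the current Unix time as the item value."""
    cmd = build_sender_command(server, host, key, int(time.time()))
    return run(cmd, check=False)


def is_stale(last_update, now, average_interval, factor=1.5):
    """True if the last check-in is at least `factor` average intervals old."""
    return (now - last_update) >= factor * average_interval
```

A call might look like `report_checkin("zabbix.example.com", "test-machine", "batch.heartbeat")`, assuming a trapper item with that key has been created on the Zabbix side.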
There are two solutions:
For (1), download pslist and bmail. Use them with the following batch script:
NOTE: You will need to edit YOUR_BATCH_SCRIPT and the parameters for bmail (smtpserver etc.) to suit your environment.
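The same check-and-email idea can also be sketched in Python. This is not the pslist/bmail batch script referred to above, only an illustration of the logic under stated assumptions: the process name, the email addresses, and the helper names are all hypothetical, and the process listing is passed in as text so the check stays independent of any particular tool.

```python
from email.message import EmailMessage


def is_running(process_name, listing):
    """True if any line of the process listing starts with process_name."""
    name = process_name.lower()
    return any(line.lower().startswith(name) for line in listing.splitlines())


def build_alert(process_name, sender, recipient):
    """Build the alert email for a process that was not found."""
    msg = EmailMessage()
    msg["Subject"] = f"{process_name} is not running"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(f"The watchdog could not find {process_name} in the process list.")
    return msg


def check_and_alert(process_name, listing, send):
    """Call send(message) if the process is missing. Returns True if an alert was sent."""
    if is_running(process_name, listing):
        return False
    # Placeholder addresses; replace with real ones for your environment.
    send(build_alert(process_name, "watchdog@example.com", "team@example.com"))
    return True
```

On Windows, the listing could come from `subprocess.run(["tasklist"], capture_output=True, text=True).stdout`, and `send` could be the `send_message` method of an `smtplib.SMTP` connection to your mail server. Run the check from a scheduled task so it does not depend on the batch program itself.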
For (2), you can use a utility like Application Monitor to restart your batch program if it crashes.
Guys, thank you all for your responses; I'm grateful for all your help. I came back to inform you (and others who have or will have the same need) that I've found a service that fits my requirements. I'm now using the free Pushmon service. It's about to launch, but I've already tried it via an invite code. I've been using it for several weeks along with our new scheduled testing programs, and so far it hasn't failed me.
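For anyone wiring this up, the client side of push monitoring is usually just a periodic HTTP request. Here is a minimal sketch in Python, assuming the service gives you a ping URL to call; the URL shown in the usage note is a made-up placeholder, not Pushmon's actual API.

```python
import urllib.request


def send_heartbeat(url, opener=urllib.request.urlopen, timeout=10):
    """Ping the monitoring URL. Returns True if the service answered with HTTP 2xx."""
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # Network errors, HTTP errors, and timeouts all end up here.
        # A missed ping is what makes the service raise its alert.
        return False
```

The test program would call something like `send_heartbeat("https://monitor.example.com/ping/your-id")` after each successful run. A failed ping returns False rather than raising, so a monitoring outage does not stop the tests themselves.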