Nagios timeout configuration
How do I set an individual timeout per service check? All timeouts default to 60 seconds as per the main configuration, but one particular check needs a longer timeout because of its execution time.
How can this be done? Please help.
Thanks
Comments (3)
The best way to fix this is to move the check from 'active' to 'passive' execution. Nagios imposes no time limit at all on passive checks. You would have to build any timeouts you need into the passive check script or program itself. This gives you complete control over how long a check runs, and even allows daemonized checks that run constantly if needed.
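As a rough sketch of what the Nagios side of such a passive check might look like, the service definition below turns off active scheduling and accepts results submitted from outside. The host name, service description, template name, and freshness threshold are hypothetical placeholders, and a `check_dummy` command wrapping the standard `check_dummy` plugin is assumed to be defined already.

```
# Hypothetical passive service; every name and value here is a placeholder.
define service {
    use                     generic-service   ; assumed existing template
    host_name               backup-server
    service_description     Nightly Backup
    active_checks_enabled   0                 ; Nagios does not schedule this check itself
    passive_checks_enabled  1                 ; results are submitted externally
    check_freshness         1                 ; run check_command only when results go stale
    freshness_threshold     93600             ; seconds (26 hours in this example)
    check_command           check_dummy!3!"No passive result received"
}
```

The long-running script would then report its own result by writing a line of the form `[<timestamp>] PROCESS_SERVICE_CHECK_RESULT;<host_name>;<service_description>;<return_code>;<plugin_output>` to the Nagios external command file.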
The main purpose of the global timeouts is to prevent a condition I call 'check-stacking'. With a higher global timeout (especially one that is higher than the smallest possible check interval), you risk firing off a new check before the old one has finished. This can lead to anything from running out of resources on your Nagios system to stressing your target machine and/or application. So it's generally a bad idea to increase those global timeouts. When a timeout is reached, Nagios kills the forked check process.
As noted in the documentation, this timeout is a last-ditch effort to control service checks that are not behaving properly. If you know that a check will take longer for a good reason, then I suggest raising this limit in nagios.cfg. There is no setting for service or host checks that will override this.
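For reference, the global limit is the `service_check_timeout` directive in nagios.cfg. A minimal sketch of raising it, where the value 120 is only an illustrative assumption:

```
# nagios.cfg (main configuration file)
# Maximum time in seconds that any service check may run before Nagios kills it.
# 120 is an example value; the setting applies to every service check.
service_check_timeout=120
```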
EDIT: I just realized you weren't talking about JUST remote service checks so, unfortunately, I need to change my answer to "you can't". If you want to change the timeout settings for service checks then you must apply it to all service checks in the main configuration files.
You'll need to define a second command, one that uses your special timeout setting.
For example, this may be your original check in commands.cfg:
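A definition along these lines is one possibility, assuming the check runs remotely through the `check_nrpe` plugin. The command name and argument layout are illustrative and not taken from the original answer.

```
# commands.cfg: assumed baseline NRPE command, using the plugin's default timeout
define command {
    command_name    check_nrpe
    command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
```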
And this would be your identical command with a longer timeout value (also in commands.cfg):
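As a sketch, assuming the check runs through the `check_nrpe` plugin, the longer-running variant could differ only in the plugin's `-t` option, which sets its timeout in seconds. The name `check_nrpe_long` and the value 45 are hypothetical.

```
# commands.cfg: hypothetical variant with a longer plugin timeout (-t, in seconds)
define command {
    command_name    check_nrpe_long
    command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -t 45 -c $ARG1$
}
```

A service would select it with `check_command check_nrpe_long!<remote command>`. The plugin's timeout still has to stay below `service_check_timeout` in nagios.cfg, which remains the upper bound for every service check.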
If there is a better way to do this, Nagios wizards, please let me know! It would save a lot of room in my own configuration files.