Monit: delay the next monitoring cycle after a service test's action condition is met

Posted on 2024-11-09 09:59:30

When my server gets into high load, a graceful restart of Apache seems to bring things back under control. So I set up monit, with this configuration:

set daemon 10
check system localhost
      if loadavg (1min) > 5 then exec "/etc/init.d/apache2 graceful"

So every 10 seconds, I poll the server load, and when it gets above 5, I gracefully restart Apache. However, that temporarily raises the load, so we get into a death spiral. What I want is for it to notice after 10 seconds that the load is 5 or more, and gracefully restart Apache, then wait for 5 minutes or so before checking that particular metric again.

Is there a way to do this with monit?

暗地喜欢 2024-11-16 09:59:30

It's not entirely within monit, but it's close enough:

set daemon 10
check system localhost
  if loadavg (1min) > 5 then unmonitor
  if loadavg (1min) > 5 then exec "/etc/init.d/apache2 graceful"
  if loadavg (1min) > 5 then exec "python /scripts/remonitor.py"

Then you have a python script, like so:

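import subprocess
import time

# Wait five minutes before turning monitoring back on.
time.sleep(5 * 60)
# "monit monitor" takes the service name from the check line ("localhost" here).
subprocess.run(["monit", "monitor", "localhost"])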

So this will:
1. Unmonitor the system check when the load gets too high, to prevent the death spiral
2. Restart Apache gracefully
3. Start the script that will re-monitor the system check in 5 minutes
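
The delayed re-monitor step could also be written with the delay and the service name as parameters, which makes it easier to try out without actually sleeping. This is only a minimal sketch: it assumes the `monit` binary is on the PATH and that the check is named `localhost`, as in the `check system localhost` line above. The helper names `build_monitor_command` and `remonitor_after` are hypothetical.

```python
import subprocess
import time


def build_monitor_command(service):
    """Return the argv list that asks monit to resume monitoring a service."""
    return ["monit", "monitor", service]


def remonitor_after(delay_seconds, service, sleep=time.sleep, run=subprocess.run):
    """Wait delay_seconds, then re-enable monitoring of the named service.

    sleep and run are injectable so the function can be exercised
    without a real delay or a real monit installation.
    """
    sleep(delay_seconds)
    return run(build_monitor_command(service))
```

Ending the script with `remonitor_after(5 * 60, "localhost")` would give the same five-minute pause as the script above.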

故乡的云 2024-11-16 09:59:30

What about

set daemon 10

set limits { programtimeout: 300 seconds }

check system localhost
   if loadavg (1min) > 5 then exec "/bin/sh -c '/etc/init.d/apache2 graceful && sleep 5m'"

or even

set daemon 10

check system localhost
   start program = "/bin/sh -c '/etc/init.d/apache2 graceful && sleep 5m'" with timeout 330 seconds
   if loadavg (1min) > 5 then start

I.e., just add the sleep 5m shell command after the command to restart Apache and add the appropriate timeout to the monitrc.
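
A different way to get the same pause, not taken from the answer above, is to let the restart command itself check whether enough time has passed since the last restart, so nothing has to sleep. The sketch below is one possible shape for such a wrapper. The stamp file path, the five-minute cooldown, and all function names are assumptions for illustration.

```python
import os
import subprocess
import time

COOLDOWN_SECONDS = 5 * 60  # assumed pause between restarts
STAMP_FILE = "/var/run/apache-graceful.stamp"  # hypothetical path


def cooldown_elapsed(last_restart, now, cooldown=COOLDOWN_SECONDS):
    """True if no restart has been recorded or the cooldown has passed."""
    return last_restart is None or now - last_restart >= cooldown


def read_last_restart(path=STAMP_FILE):
    """Return the stamp file's modification time, or None if it is missing."""
    try:
        return os.path.getmtime(path)
    except OSError:
        return None


def restart_if_due(path=STAMP_FILE, now=None, run=subprocess.run):
    """Restart Apache gracefully unless a restart happened recently."""
    now = time.time() if now is None else now
    if not cooldown_elapsed(read_last_restart(path), now):
        return False
    # Record the restart time before running the command.
    with open(path, "w"):
        pass
    os.utime(path, (now, now))
    run(["/etc/init.d/apache2", "graceful"])
    return True
```

With a call to `restart_if_due()` at the end of the script, monit could run it from the existing `exec` action on every cycle where the load is too high, and the script would simply do nothing until the cooldown has passed.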
