Debugging monit
I find debugging monit to be a major pain. Monit's shell environment has basically nothing in it (no paths or other environment variables). Also, there is no log file that I can find.
The problem is that if the start or stop command in the monit script fails, it is difficult to discern what went wrong. Often it is not as simple as running the command in a shell, because the interactive shell environment differs from the environment monit runs commands in.
What techniques do people use to debug monit configurations?
For example, I would be happy to have a monit shell to test my scripts in, or a log file to see what went wrong.
Comments (7)
I've had the same problem. Using monit's verbose command-line option helps a bit, but I found the best way was to create an environment as similar as possible to the monit environment and run the start/stop program from there.
I've found the most common problems are related to environment variables (especially PATH) or to permissions. Remember that monit usually runs as root. Also, if you use "as uid myusername" in your monit config, you should switch to user myusername before carrying out the test.
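One way to approximate monit's environment from an interactive shell is to start the command with an emptied environment and a minimal PATH. This is a minimal sketch: the PATH value is an assumption (it matches what the monit manual describes for recent versions, but check your own), and run_like_monit is a hypothetical helper name.

```shell
#!/bin/sh
# Assumed minimal PATH; verify it against your monit version's documentation.
MONIT_LIKE_PATH="/bin:/usr/bin:/sbin:/usr/sbin"

# run_like_monit 'command string'
# Runs the command string under /bin/sh with every inherited variable removed
# and only PATH set, which is roughly what a start/stop program sees.
run_like_monit() {
  env -i PATH="$MONIT_LIKE_PATH" /bin/sh -c "$1"
}

# Example: list the variables a start/stop script would actually have.
run_like_monit 'env'
```

If the service entry uses "as uid myusername", running the same call through sudo -u myusername (or from a shell opened as that user) gets closer to what monit does.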
Be sure to always double-check your configuration and monitor your processes by hand before letting monit handle everything. systat(1), top(1) and ps(1) are your friends for figuring out resource usage and limits. Knowing the process you monitor is essential too.
Regarding the start and stop scripts, I use a wrapper script to redirect output and inspect the environment and other variables. Something like this:
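The script from the answer is not reproduced on this page, so what follows is only a minimal sketch of such a wrapper, not the author's version. The log path, the WRAPPER_LOG variable, and the run_logged name are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical wrapper, e.g. saved as /usr/local/bin/monit-wrapper.sh
# Usage: monit-wrapper.sh <command> [args...]
LOG="${WRAPPER_LOG:-/tmp/monit-wrapper.log}"

# Record who we are, what the environment looks like, and what the
# wrapped command prints, then hand back the command's exit status.
run_logged() {
  {
    echo "=== $(date) ==="
    echo "user: $(id)"
    echo "cwd:  $(pwd)"
    echo "--- environment ---"
    env
    echo "--- resource limits ---"
    ulimit -a
    echo "--- running: $* ---"
    "$@"
    status=$?
    echo "exit status: $status"
  } >> "$LOG" 2>&1
  return "$status"
}

# Only run something when a command was actually passed in.
if [ "$#" -gt 0 ]; then
  run_logged "$@"
fi
```

Because the wrapper returns the wrapped command's exit status, monit still sees a failed start or stop as a failure, while the log keeps the environment and output for later inspection.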
Then in monit:
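The matching monit entry is also missing from this page. As a rough sketch, the snippet below writes a sample control file that routes start and stop through a wrapper, then syntax-checks it when monit is available. Every name in it (myapp, the pidfile, the wrapper path, the scratch file) is hypothetical.

```shell
#!/bin/sh
# Scratch location for the sample control file (hypothetical path).
CONF="${CONF:-/tmp/myapp.monitrc}"

cat > "$CONF" <<'EOF'
check process myapp with pidfile /var/run/myapp.pid
  start program = "/usr/local/bin/monit-wrapper.sh /usr/local/bin/myapp start"
  stop program  = "/usr/local/bin/monit-wrapper.sh /usr/local/bin/myapp stop"
EOF

# monit expects control files to be private to their owner.
chmod 600 "$CONF"

# -t only parses the control file; nothing is started.
if command -v monit >/dev/null 2>&1; then
  monit -t -c "$CONF" || echo "monit reported problems in $CONF"
fi
```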
You still have to figure out what information you want in the wrapper, such as process information, IDs, system resource limits, etc.
You can start Monit in verbose/debug mode by adding MONIT_OPTS="-v" to /etc/default/monit (don't forget to restart: /etc/init.d/monit restart). You can then capture the output using tail -f /var/log/monit.log
monit -c /path/to/your/config -v
By default, monit logs to your system message log and you can check there to see what's happening.
Also, depending on your config, you might be logging to a different place
http://mmonit.com/monit/documentation/monit.html#LOGGING
Assuming defaults (as of whatever old version of monit I'm using), you can tail the logs as such:
CentOS:
Ubuntu:
Mac OS X:
Windows: Here be dragons
But there is a neato project I found while searching on how to do this out of morbid curiosity: https://github.com/derFunk/monit-windows-agent
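The per-platform commands did not survive on this page. As a hedged sketch, the helper below looks for the first readable file among some conventional log locations. The candidate paths are assumptions based on common syslog defaults, not values taken from the answer, and first_readable is a hypothetical name.

```shell
#!/bin/sh
# Print the first readable file among the arguments; fail if none is readable.
first_readable() {
  for f in "$@"; do
    if [ -r "$f" ]; then
      printf '%s\n' "$f"
      return 0
    fi
  done
  return 1
}

# Assumed candidates: a dedicated monit log, then common system logs.
if log=$(first_readable /var/log/monit.log /var/log/messages \
                        /var/log/syslog /var/log/system.log); then
  echo "monit messages are probably in: $log"
else
  echo "no readable log found; check the logfile setting in your monitrc"
fi
```

Once a file is found, following it with tail -f and filtering for "monit" shows events as they happen. Reading system logs may require root.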
Yeah, monit isn't too easy to debug.
Here are a few best practices:
shell:
That helps a lot.
The other thing I find helpful is to run monit with '-v', which gives you verbose output. So the workflow is
You can also try running monit validate once the processes are running to find out whether any of them are having problems (this sometimes gives you more information than the log files do). Beyond that, there's not much more you can do.