Monitoring the memory usage of child processes
I have a Linux daemon that forks a few children and monitors them for crashes (restarting as needed).
It would be great if the parent could monitor the memory usage of the child processes, to detect memory leaks and restart a child when it grows beyond a certain size.
How can I do this?
Comments (2)
You should be able to get detailed memory information out of /proc/{PID}/status, which includes fields such as VmSize and VmRSS.
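As a rough illustration of reading that file, here is a minimal Python sketch. The helper names `parse_status` and `read_status` are made up for this example, and it assumes the usual Linux layout in which the memory lines look like `VmRSS:  1234 kB`.

```python
def parse_status(text):
    """Return the Vm* fields of a /proc/<pid>/status dump as {name: kB}."""
    fields = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if not sep or not name.startswith("Vm"):
            continue
        parts = value.split()
        # Vm* lines look like "VmRSS:      1234 kB".
        if parts and parts[0].isdigit():
            fields[name] = int(parts[0])
    return fields


def read_status(pid):
    """Read and parse /proc/<pid>/status for a live process (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        return parse_status(f.read())
```

A parent could call something like `read_status(child_pid).get("VmRSS")` on each polling pass and compare the result with its own limit.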
However, unless the memory leaks are dramatic, it's difficult to detect them by looking at process statistics, because malloc and free are usually far removed from the system calls (brk/sbrk) that actually change the size of the process.
You can also check /proc/${PID}/statm.
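For statm, a similar sketch might look like the following. The field order is the one documented in proc(5); the function names are hypothetical, and the values in the file are page counts, so they are multiplied by the page size to get bytes.

```python
import os

# Order of the columns in /proc/<pid>/statm, as documented in proc(5).
STATM_FIELDS = ("size", "resident", "shared", "text", "lib", "data", "dt")


def parse_statm(text, page_size):
    """Convert one statm line (counts of pages) to {field: bytes}."""
    values = [int(v) for v in text.split()]
    return {name: pages * page_size
            for name, pages in zip(STATM_FIELDS, values)}


def read_statm(pid):
    """Read /proc/<pid>/statm for a live process (Linux only)."""
    page_size = os.sysconf("SC_PAGE_SIZE")
    with open(f"/proc/{pid}/statm") as f:
        return parse_statm(f.read(), page_size)
```

statm is a single line of numbers, so it is cheaper to parse than status if the parent polls many children frequently.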
You could try having a monitor script run vmstat in parallel with your process (note that this is not a good idea if you run the script multiple times, as you'll get multiple copies of vmstat). The monitor script can add the free memory to the buffer and cache sizes to get the amount of memory the OS has available, and you can track that. If it drops below some threshold, you can check for the biggest processes by calling ps -e -o... (see the man page for details, but try vsz,pcpu,user,pid,args as a starting point).
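One way this could be sketched in Python is shown below. It runs vmstat once per check rather than leaving it running, which is a simplification of what is described above. All function names are invented for the example, and it assumes the default vmstat layout (a banner line, a line of column names including `free`, `buff` and `cache`, then the samples) and a ps that prints a header row.

```python
import subprocess


def parse_vmstat(text):
    """Return the last sample of vmstat output as {column: int}."""
    lines = [ln.split() for ln in text.strip().splitlines()]
    # Line 0 is the group banner, line 1 holds the column names,
    # and the last line holds the newest sample.
    names, values = lines[1], lines[-1]
    return {n: int(v) for n, v in zip(names, values)}


def available_kb(sample):
    """Free memory plus buffers and cache, as suggested above."""
    return sample["free"] + sample["buff"] + sample["cache"]


def parse_ps(text):
    """Parse `ps -e -o vsz,pcpu,user,pid,args` output, largest VSZ first."""
    procs = []
    for line in text.splitlines()[1:]:  # skip the header row
        parts = line.split(None, 4)
        if len(parts) < 5:
            continue
        vsz, pcpu, user, pid, args = parts
        # pcpu is kept as printed; only vsz and pid are converted.
        procs.append({"vsz": int(vsz), "pcpu": pcpu, "user": user,
                      "pid": int(pid), "args": args.rstrip()})
    return sorted(procs, key=lambda p: p["vsz"], reverse=True)


def biggest_if_low(threshold_kb, top=3):
    """Run vmstat once; if memory is low, return the largest processes."""
    vm = subprocess.run(["vmstat"], capture_output=True, text=True,
                        check=True)
    if available_kb(parse_vmstat(vm.stdout)) >= threshold_kb:
        return []
    ps = subprocess.run(["ps", "-e", "-o", "vsz,pcpu,user,pid,args"],
                        capture_output=True, text=True, check=True)
    return parse_ps(ps.stdout)[:top]
```

The threshold and the number of processes returned are arbitrary here and would need tuning for the machine in question.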
I'd advise running this monitor as a separate process and having it kill the rogue process when it gets too large. You could restrict the set of processes monitored by using the appropriate process-selection parameter to ps.
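The kill step might be sketched as follows. This is only an illustration: `rss_kb` and `kill_if_too_large` are hypothetical names, the limit is whatever you choose, and it uses the resident size from /proc/<pid>/status rather than ps output.

```python
import os
import signal


def rss_kb(pid):
    """Return VmRSS of a process in kB from /proc/<pid>/status, or None."""
    try:
        with open(f"/proc/{pid}/status") as f:
            for line in f:
                if line.startswith("VmRSS:"):
                    return int(line.split()[1])
    except OSError:
        pass  # process already gone, or /proc not available
    return None


def kill_if_too_large(pid, limit_kb, read_rss=rss_kb, sig=signal.SIGTERM):
    """Send `sig` to pid when its resident size exceeds limit_kb.

    Returns True if a signal was sent.
    """
    rss = read_rss(pid)
    if rss is None or rss <= limit_kb:
        return False
    try:
        os.kill(pid, sig)
    except ProcessLookupError:
        return False  # exited between the check and the kill
    return True
```

If the daemon from the question already restarts children that exit, sending SIGTERM is probably enough, since the existing restart logic would then bring up a fresh child.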
This is all a hack (UK meaning), though; the right solution is to fix the leaks, assuming you have the code.