Checking the open FD limit for a given process in Linux
I recently had a Linux process which "leaked" file descriptors: it opened them and didn't properly close some of them.
If I had monitored this, I could have told in advance that the process was reaching its limit.
Is there a nice Bash or Python way to check the FD usage ratio for a given process on an Ubuntu Linux system?
EDIT:
I now know how to check how many file descriptors are open; I only need to know how many file descriptors a process is allowed. Some systems (like Amazon EC2) don't have the /proc/pid/limits file.
Comments (7)
Count the entries in /proc/<pid>/fd/. The hard and soft limits applying to the process can be found in /proc/<pid>/limits.
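A minimal Python sketch of the /proc approach described above, assuming a Linux system where /proc/<pid>/limits exists and you have permission to read the target's /proc entries. The function names (count_open_fds, read_nofile_limits, fd_usage_ratio) are illustrative and not part of any library.

```python
import os


def count_open_fds(pid):
    """Count the entries in /proc/<pid>/fd (needs permission to read it)."""
    # When inspecting your own PID, the count includes the descriptor
    # that is temporarily opened to list the directory itself.
    return len(os.listdir(f"/proc/{pid}/fd"))


def read_nofile_limits(pid):
    """Parse the 'Max open files' row of /proc/<pid>/limits into (soft, hard) strings."""
    with open(f"/proc/{pid}/limits") as limits_file:
        for line in limits_file:
            if line.startswith("Max open files"):
                # Row layout: name, soft limit, hard limit, units.
                soft, hard = line[len("Max open files"):].split()[:2]
                return soft, hard
    raise RuntimeError("no 'Max open files' row found")


def fd_usage_ratio(pid):
    """Fraction of the soft limit currently in use, or None if unlimited."""
    soft, _hard = read_nofile_limits(pid)
    if soft == "unlimited":
        return None
    return count_open_fds(pid) / int(soft)


if __name__ == "__main__":
    pid = os.getpid()  # replace with the PID you want to monitor
    print(pid, count_open_fds(pid), read_nofile_limits(pid), fd_usage_ratio(pid))
```

A monitoring script could call fd_usage_ratio periodically and raise an alert once the ratio passes a threshold of your choosing.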
The only interfaces provided by the Linux kernel to get resource limits are getrlimit() and /proc/pid/limits. getrlimit() can only get the resource limits of the calling process. /proc/pid/limits allows you to get the resource limits of any process with the same user ID, and is available on RHEL 5.2, RHEL 4.7, Ubuntu 9.04, and any distribution with a 2.6.24 or later kernel.
If you need to support older Linux systems, then you will have to get the process itself to call getrlimit(). Of course, the easiest way to do that is by modifying the program or a library that it uses. If you are the one running the program, you could use LD_PRELOAD to load your own code into it. If none of those are possible, you could attach to the process with gdb and have it execute the call within the process. You could also do the same thing yourself using ptrace() to attach to the process, insert the call in its memory, etc.; however, this is very complicated to get right and is not recommended.
With appropriate privileges, the other ways to do this would involve looking through kernel memory, loading a kernel module, or otherwise modifying the kernel, but I am assuming that these are out of the question.
One approach is to list the top 20 processes by number of open file handles, with output in the format: file handle count, PID, command line of the process.
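A hedged Python sketch of such a "top 20" listing, assuming a Linux /proc filesystem. This is not the answerer's original command; the function name top_fd_users is made up for illustration. Processes you lack permission to inspect are skipped, so run it as root to see every process.

```python
import os


def top_fd_users(limit=20):
    """Return up to `limit` tuples (fd_count, pid, cmdline), highest count first."""
    rows = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            fd_count = len(os.listdir(f"/proc/{entry}/fd"))
            with open(f"/proc/{entry}/cmdline", "rb") as cmdline_file:
                raw = cmdline_file.read()
        except OSError:
            continue  # process exited, or we lack permission to inspect it
        # Arguments in cmdline are separated by NUL bytes.
        cmdline = raw.replace(b"\0", b" ").decode(errors="replace").strip()
        rows.append((fd_count, int(entry), cmdline))
    rows.sort(reverse=True)
    return rows[:limit]


if __name__ == "__main__":
    for fd_count, pid, cmdline in top_fd_users():
        print(fd_count, pid, cmdline)
```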
You can try writing a script that periodically calls lsof -p {PID} on the given PID.

You asked for Bash/Python methods. ulimit would be the best Bash approach (short of munging through /proc/$pid/fd and the like by hand); note that it reports the limits of the current shell. For Python, you could use the resource module. resource.getrlimit corresponds to the getrlimit call in a C program. The result is a pair holding the current (soft) and maximum (hard) values for the requested resource. A current (soft) limit of 1024 is a typical default on Linux systems these days.
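A short sketch of the resource-module approach mentioned above. Note that resource.getrlimit reports the limits of the calling process only, so this has to run inside the process you care about; the values printed depend on your system's configuration.

```python
import resource

# RLIMIT_NOFILE is the limit on the number of open file descriptors.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("soft limit:", soft)
print("hard limit:", hard)
```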
In CentOS 6 and below (anything using GCC 3), you may find that adjusting the kernel limits does not resolve the issue. This is because FD_SETSIZE is a constant fixed at compile time. To get past it, you will need to increase the value and then recompile the program.
Also, you may find that you are leaking file descriptors due to known issues in libpthread, if you are using that library. This call was integrated into GCC as of GCC 4 / CentOS 7 / RHEL 7, which seems to have fixed the threading issues.
A Python wrapper can be built on the excellent psutil package.
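A minimal sketch of what a psutil-based wrapper might look like; this is not the answerer's original code. It assumes psutil is installed and that you are on Linux, since Process.rlimit is a Linux-only psutil API. The function name fd_usage is made up for illustration.

```python
def fd_usage(pid):
    """Return (open_fds, soft_limit, hard_limit) for `pid` using psutil."""
    # Imported here so this module still loads where psutil is absent.
    import psutil  # third-party: pip install psutil

    proc = psutil.Process(pid)
    soft, hard = proc.rlimit(psutil.RLIMIT_NOFILE)  # Linux-only API
    return proc.num_fds(), soft, hard


# Example: open_fds, soft, hard = fd_usage(1234)  # 1234 is a placeholder PID
```

Unlike resource.getrlimit, this can inspect another process, subject to the usual permission checks.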