How to measure the CPU usage of a multiprocess application in Linux
I am trying to write a program in C/C++ that behaves like the top command in Linux.
I've done some research and already know how to compute the CPU usage of a single process: read stime + utime from /proc/[PID]/stat now and again several seconds later, take the difference, and divide it by the difference in uptime over the same interval to get the CPU usage percentage. This is easy for a single-process or multithreaded program.
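For reference, the single-process calculation described above could be sketched roughly as follows. This is only an illustration, not code from the post: the function names `cpu_ticks_from_stat` and `cpu_percent` are made up. It relies on utime and stime being fields 14 and 15 of /proc/[PID]/stat and being expressed in clock ticks, so the tick rate (normally obtained from `sysconf(_SC_CLK_TCK)`) is needed to compare them with an interval in seconds.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: returns utime + stime (in clock ticks) parsed from one
// line of /proc/[PID]/stat, or -1 if the line cannot be parsed.
long long cpu_ticks_from_stat(const std::string& line) {
    // Field 2 (the command name) is parenthesised and may itself contain
    // spaces or parentheses, so parsing resumes after the LAST ')'.
    std::size_t close = line.rfind(')');
    if (close == std::string::npos) return -1;

    std::istringstream rest(line.substr(close + 1));
    std::string state;                  // field 3
    if (!(rest >> state)) return -1;

    long long value = 0, utime = 0, stime = 0;
    for (int field = 4; field <= 15; ++field) {   // fields 4..15 are numeric
        if (!(rest >> value)) return -1;
        if (field == 14) utime = value;
        if (field == 15) stime = value;
    }
    return utime + stime;
}

// Hypothetical helper: CPU usage in percent of one core over an interval.
double cpu_percent(long long ticks_before, long long ticks_after,
                   double seconds, long ticks_per_second) {
    if (seconds <= 0 || ticks_per_second <= 0) return 0.0;
    return 100.0 * (ticks_after - ticks_before) / (ticks_per_second * seconds);
}
```

In a real program the line would come from reading /proc/[PID]/stat twice, and `ticks_per_second` from `sysconf(_SC_CLK_TCK)`.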
The problem is with a case like httpd, which runs as multiple processes. When the web server is busy, httpd forks child processes to serve the incoming requests. Say I count 500 processes in total. I want to calculate the CPU usage of all of them but aggregate it, so that I see a single CPU usage figure for httpd. But if I apply the algorithm described above and the number of processes drops below 500 a few seconds later, I get negative values, because the calculation looks like this (the numbers are made up, just to illustrate):
Uptime: 155123, No of processes : 500, Stime + Utime total of 500 processes : 3887481923874
Uptime: 155545, No of processes : 390, Stime + Utime total of 390 processes : 2887123343874
In the example above, the delta of stime + utime is negative: the number of processes has decreased, so the summed total is lower in the second sample. Is there another way to measure the CPU usage of an application that behaves like this? Thank you.
Comments (2)
I suggest keeping the data for each process separately.
When you take a new sample, each process falls into one of three categories:
1. Existed both before and now - subtract the old value from the new one.
2. Exists now, but not before - just take the new value.
3. Existed before, but not now - ignore it. You lose some accuracy here, because the process may have used CPU for 90% of your sample period before it exited, but I hope you don't need perfect accuracy.
This means keeping more data between samples and using a more complicated data structure, but it should give reasonable results.
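A minimal sketch of the three cases above, assuming each sample is stored as a map from PID to utime + stime in clock ticks. The `Sample` alias and the function name `group_ticks_delta` are hypothetical, and treating a counter that went backwards as a reused PID is my own assumption, not part of the suggestion above.

```cpp
#include <map>
#include <sys/types.h>

// One sample of a process group: PID -> utime + stime in clock ticks.
using Sample = std::map<pid_t, long long>;

// Sums the CPU ticks the group consumed between two samples.
long long group_ticks_delta(const Sample& previous, const Sample& current) {
    long long total = 0;
    for (const auto& entry : current) {
        const long long ticks = entry.second;
        const auto old = previous.find(entry.first);
        if (old == previous.end()) {
            total += ticks;                 // case 2: new process, take the new value
        } else if (ticks >= old->second) {
            total += ticks - old->second;   // case 1: subtract old from new
        } else {
            // Assumption: a counter that decreased means the PID was reused
            // by a new process, so it is treated like case 2.
            total += ticks;
        }
    }
    // Case 3: PIDs present only in `previous` are ignored, so whatever they
    // consumed after the previous sample is lost.
    return total;
}
```

Because every term is non-negative, the aggregate can no longer go negative when child processes exit between samples. A more careful version could also compare the starttime field of /proc/[PID]/stat to detect PID reuse.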
If you need accurate results, or if the processes are short-lived, then you must read a process's CPU time at the moment it terminates.
There are at least two ways:
1) Use the wait4(2) or wait3(2) functions to wait for process termination. These functions return the utime and stime of the process.
2) Keep terminated processes in the zombie state until you have read /proc/<pid>/stat.
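As a rough illustration of the first option, the sketch below reaps a child with wait4(2) and adds up the user and system time from the `struct rusage` it fills in. The function name `reap_cpu_seconds` is made up for this example. Note that wait4 only works on children of the calling process, so this fits a program that forks the workers itself rather than an external monitor.

```cpp
#include <sys/resource.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: blocks until child `pid` terminates, reaps it, and
// returns its total CPU time (user + system) in seconds.
// Returns -1.0 if wait4 fails, for example when `pid` is not our child.
double reap_cpu_seconds(pid_t pid) {
    int status = 0;
    struct rusage usage;
    if (wait4(pid, &status, 0, &usage) != pid) return -1.0;

    const double user_s = usage.ru_utime.tv_sec + usage.ru_utime.tv_usec / 1e6;
    const double sys_s  = usage.ru_stime.tv_sec + usage.ru_stime.tv_usec / 1e6;
    return user_s + sys_s;
}
```

A parent would call this with the PID returned by `fork()` and add the result to its running total for the group.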