Does gprof account for blocking time?
I am running gprof on my executable, but the executable spends a lot of time wait()ing for child processes to complete. Is the time spent waiting factored into the gprof timings?
Comments (3)
I haven't used gprof much, but to my knowledge, neither the wait nor the child processes per se are profiled. See a simple example:
The gprof output for this is (on my machine):
If you actually want to profile the children/threads, I'd suggest this as a starting point.
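The underlying effect can be reproduced without gprof. The sketch below is in Python rather than C and is only an illustration: measure_wait is a hypothetical helper, and the child is assumed to be a second Python interpreter that just sleeps. It compares the wall-clock time the parent spends blocked on the child with the CPU time charged to the parent itself, which is the kind of time a CPU-time-based profiler gets to see.

```python
import subprocess
import sys
import time


def measure_wait(sleep_seconds=0.5):
    """Return (wall, cpu) seconds spent by the parent while a child sleeps.

    measure_wait is a hypothetical helper written for this illustration.
    """
    # The child is another Python interpreter that only sleeps.
    child_cmd = [sys.executable, "-c",
                 f"import time; time.sleep({sleep_seconds})"]
    wall_start = time.perf_counter()   # wall-clock timer
    cpu_start = time.process_time()    # CPU time of this process only
    subprocess.run(child_cmd, check=True)  # blocks until the child exits
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return wall, cpu


if __name__ == "__main__":
    wall, cpu = measure_wait()
    print(f"wall-clock: {wall:.3f}s, parent CPU: {cpu:.3f}s")
```

The parent's CPU time should come out far smaller than the wall-clock time, since being blocked on the child costs the parent almost no CPU.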
It seems there is an option to log forked processes; this IBM article talks about it a bit.
The same article recommends trying tprof. It is similar to gprof in use, but uses different methods under the hood that might give a more accurate picture for multi-process/multi-threaded applications.
gprof only counts actual CPU time in your process. What works a lot better is something that samples the call stack, and samples it on wall-clock time, not CPU time. Of course, samples should not be taken while waiting for user input (or if they are taken, they should be discarded). Some profilers can do all this, such as RotateRight/Zoom, or you can use pstack or lsstack, but here's a simple way to do it.
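To make the idea of wall-clock stack sampling concrete, here is a rough Python sketch. It is not gprof, and it is not necessarily the method this answer points to. All names (sample_stacks, blocked, busy, workload) are hypothetical, and it relies on sys._current_frames(), which is a CPython implementation detail. A background thread wakes up on a wall-clock interval and records which functions are on the main thread's stack, whether or not that thread is using any CPU.

```python
import collections
import sys
import threading
import time


def sample_stacks(target, interval=0.01):
    """Run target() while sampling its call stack on wall-clock time.

    Returns a Counter mapping function name -> number of samples in
    which that function appeared somewhere on the stack.
    """
    counts = collections.Counter()
    done = threading.Event()
    thread_id = threading.get_ident()  # thread that will run target()

    def sampler():
        # Event.wait() times out on wall-clock time, so samples are taken
        # even while the target thread is blocked and using no CPU.
        while not done.wait(interval):
            frame = sys._current_frames().get(thread_id)
            names = set()
            while frame is not None:  # walk from innermost to outermost
                names.add(frame.f_code.co_name)
                frame = frame.f_back
            counts.update(names)

    worker = threading.Thread(target=sampler, daemon=True)
    worker.start()
    try:
        target()
    finally:
        done.set()
        worker.join()
    return counts


def blocked():
    time.sleep(0.3)  # stands in for wait()ing on a child process


def busy():
    end = time.perf_counter() + 0.1
    while time.perf_counter() < end:  # burn CPU for about 0.1 s
        pass


def workload():
    blocked()
    busy()


if __name__ == "__main__":
    for name, hits in sample_stacks(workload).most_common():
        print(f"{hits:4d}  {name}")
```

Because the samples follow the wall clock, blocked should collect roughly three times as many hits as busy, even though it uses almost no CPU. A profiler that only counts CPU time would rank the two the other way round.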