How do I modify a C program so that gprof can profile it?
When I run gprof on my C program, it says "no time accumulated" and shows 0 time for all function calls. However, it does count the function calls.
How do I modify my program so that gprof will be able to count how much time something takes to run?
Did you specify -pg when compiling?
http://sourceware.org/binutils/docs-2.20/gprof/Compiling.html#Compiling
Once it is compiled, you run the program and then run gprof on the binary.
E.g.:
test.c:
Compile with cc -pg test.c, run it as a.out, then run gprof a.out, which gives me:
What are you getting?
I tried running Kinopiko's example, except I increased the number of iterations by a factor of 100.
test.c:
Then I took 10 stackshots (under VC, but you can use pstack). Here are the stackshots:
In case it isn't obvious, this tells you that:
In a nutshell, the program spends ~100% of its time flushing the output buffer to disk (or console) as part of the printf on line 7.
(What I mean by "cost of a line" is the fraction of total time spent at the request of that line, which is roughly the fraction of samples that contain it.
If that line could be made to take no time, such as by removing it, skipping over it, or passing its work off to an infinitely fast coprocessor, that time fraction is how much the total time would shrink. So if the execution of any of these lines of code could be avoided, time would shrink by somewhere in the range of 95% to 100%. If you were to ask "What about recursion?", the answer is It Makes No Difference.)
Now, maybe you want to know something else, like how much time is spent in the loop, for example. To find that out, remove the printf because it's hogging all the time. Maybe you want to know what % of time is spent purely in CPU time, not in system calls. To get that, just throw away any stackshots that don't end in your code.
The point I'm trying to make is that if you're looking for things you can fix to make the code run faster, the data gprof gives you, even if you understand it, is almost useless. By comparison, if some of your code is causing more wall-clock time to be spent than you would like, stackshots will pinpoint it.
One gotcha with gprof: it doesn't work with code in dynamically linked libraries. For that, you need to use sprof. See this answer: gprof : How to generate call graph for functions in shared library that is linked to main program
First compile your application with -g, and check which CPU counters you are using.
If your application runs very quickly, gprof could miss all the events, or record fewer than required (reduce the number of events to read).
Actually, profiling should show you CPU_CLK_UNHALTED or INST_RETIRED events without any special switches. But with such data you will only be able to say how well your code is performing: INST_RETIRED/CPU_CLK_UNHALTED.
Try the Intel VTune profiler; it's free for 30 days and for education.