Debugging slow functions in a C program (built with gcc)
Given source like this:
void foo() {
    func1();
    if(qqq) {
        func2();
    };
    func3();
    func4();
    for(...) {
        func5();
    }
}
I want to obtain info like this:
void foo() {
     5 ms;  2 times;    func1();
     0 ms;  2 times;    if(qqq) {
     0 ms;  0 times;        func2();
     0 ms;  2 times;    };
    20 ms;  2 times;    func3();
     5 s ;  2 times;    func4();
     0 ms; 60 times;    for(...) {
    30 ms; 60 times;        func5();
     0 ms; 60 times;    }
}
That is, for each line: how long it took to execute on average (real clock time, including time spent waiting in syscalls) and how many times it was executed.
What tools should I use?
I expect the tool to instrument each function to measure its running time. The instrumentation inside the calling function would then use those measurements and either write them to a log file or accumulate them in memory and dump them at the end.
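For illustration, the kind of instrumentation described above can be written by hand. The following is only a minimal sketch, not the output of any particular tool: the `TIMED` macro, `struct site_stat`, `dump_stats`, the empty stand-in bodies for `func1` to `func5`, the value of `qqq`, and the loop count parameter `n` are all hypothetical names and assumptions introduced here. It accumulates wall-clock time and call counts per call site in memory and dumps them on request.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L   /* for clock_gettime under strict -std modes */
#endif
#include <stdio.h>
#include <time.h>

/* Hypothetical per-call-site record: statement text, call count, total wall time. */
struct site_stat {
    const char *text;
    long calls;
    double total_ms;
};

/* Wall-clock time in milliseconds, taken from a monotonic clock. */
static double now_ms(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1000.0 + ts.tv_nsec / 1e6;
}

/* Run stmt once, adding its elapsed wall time and one call to stat. */
#define TIMED(stat, stmt) do {                  \
        double t0_ = now_ms();                  \
        stmt;                                   \
        (stat).total_ms += now_ms() - t0_;      \
        (stat).calls++;                         \
    } while (0)

/* Stand-ins for the question's functions; real bodies would go here. */
static int qqq = 0;                 /* assumed false, matching "0 times" for func2 */
static void func1(void) {}
static void func2(void) {}
static void func3(void) {}
static void func4(void) {}
static void func5(void) {}

static struct site_stat stats[5] = {
    {"func1();", 0, 0.0}, {"func2();", 0, 0.0}, {"func3();", 0, 0.0},
    {"func4();", 0, 0.0}, {"func5();", 0, 0.0},
};

/* foo() from the question, with each call wrapped; n is an assumed loop count. */
void foo(int n) {
    TIMED(stats[0], func1());
    if (qqq) {
        TIMED(stats[1], func2());
    }
    TIMED(stats[2], func3());
    TIMED(stats[3], func4());
    for (int i = 0; i < n; i++) {
        TIMED(stats[4], func5());
    }
}

/* Print average time per call and call count for every instrumented site. */
void dump_stats(FILE *out) {
    for (int i = 0; i < 5; i++) {
        double avg = stats[i].calls ? stats[i].total_ms / stats[i].calls : 0.0;
        fprintf(out, "%8.3f ms; %ld times; %s\n",
                avg, stats[i].calls, stats[i].text);
    }
}
```

Calling `foo(30)` twice and then `dump_stats(stdout)` from `main` would report two calls for `func1`, zero for `func2`, and sixty for `func5`, which matches the counts in the desired listing. Because the timer wraps the whole call, time spent blocked in syscalls is included.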
Comments (3)
gprof is pretty standard for GNU-built (gcc, g++) programs: http://www.cs.utah.edu/dept/old/texinfo/as/gprof_toc.html
Here is what the output looks like: http://www.cs.utah.edu/dept/old/texinfo/as/gprof.html#SEC5
Take a trial run of Zoom. You won't be disappointed.
P.S. Don't expect instrumentation to do the job. For either line-level or function-level information, a wall-time stack sampler delivers the goods, assuming you don't absolutely need precise invocation counts (which have little relevance to performance).
ADDED: I'm on Windows, so I just ran your code with LTProf. The output looks like this:
where each func() does a Sleep(1000) and qqq is True, so the whole thing runs for 20 seconds. The numbers on the left are the percent of samples (6,667 samples) that have that line on them. So, for example, a single call to one of the func functions uses 1 second or 5% of the total time. So you can see that the line where func5() is called uses 80% of the total time. (That is, 16 out of the 20 seconds.) All the other lines were on the stack so little, comparatively, that their percents are zero. I would present the information differently, but this should give a sense of what stack sampling can tell you.
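The 80% figure can be checked with a small reconstruction of the experiment. This is a rough sketch that measures the share directly with timestamps rather than by stack sampling, and it rests on several assumptions: the loop runs 16 times (inferred from 16 of the 20 seconds going to the `func5()` line), POSIX `nanosleep` stands in for the Windows `Sleep(1000)`, and the names `wall_s`, `func_sleep`, and `func5_share` are hypothetical.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L   /* for clock_gettime and nanosleep */
#endif
#include <stdio.h>
#include <time.h>

/* Wall-clock time in seconds from a monotonic clock. */
static double wall_s(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec / 1e9;
}

/* Stand-in for every func(): sleep for ms milliseconds. */
static void func_sleep(long ms) {
    struct timespec d = { ms / 1000, (ms % 1000) * 1000000L };
    nanosleep(&d, NULL);
}

/* Run the foo() body once and return the fraction of the total wall time
   spent on the func5() line. n is the assumed loop count. */
double func5_share(long ms, int n) {
    double start = wall_s();
    func_sleep(ms);                 /* func1() */
    func_sleep(ms);                 /* func2(), since qqq is true */
    func_sleep(ms);                 /* func3() */
    func_sleep(ms);                 /* func4() */
    double loop_start = wall_s();
    for (int i = 0; i < n; i++) {
        func_sleep(ms);             /* func5() */
    }
    double end = wall_s();
    return (end - loop_start) / (end - start);
}
```

With `ms = 1000` and `n = 16` the run takes about 20 seconds and the returned share should come out close to 16/20; a smaller `ms` gives roughly the same ratio in less time. A sampling profiler arrives at the same proportion by counting how many stack samples contain that line.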
Either Zoom or Intel VTune.