We don’t allow questions seeking recommendations for software libraries, tutorials, tools, books, or other off-site resources. You can edit the question so it can be answered with facts and citations.
Closed 7 years ago.
You can use the GNU utility gcov for line-by-line profiling; it reports how many times each line was executed. See the sample run in the GCC docs.
The file tmp.c.gcov contains output like:
I believe callgrind does that. I know it does cycle counts per line, but I'm not sure about 'time.'
Shark, one of the profiling tools in Mac OS X, can do that (or even profile by instruction). I realise that your screenshot is from Windows, so that may not be helpful, but perhaps you can run your code on a Mac. You could try Very Sleepy, but I've never used it, so I have no idea how good it is.
Check this link and try this method.
The trouble with an example like Mandelbrot is that it is not a very big program. In real-world software the call tree gets much deeper and way more bushy, so you need to find out, per line or instruction, what percent of time it is responsible for, and that is just the percent of time it is on the call stack. So, you need something that samples the call stack and tells you, for each line or instruction that appears there, what percent of samples it is on. You don't need high precision of measurement - that is one of the myths.
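As a rough illustration of the bookkeeping described above, here is a minimal Python sketch. It assumes you already have a handful of call-stack samples; the `samples` list, the `function:line` labels, and the helper name `inclusive_percent` are all made up for the example and do not come from any particular tool.

```python
from collections import Counter

def inclusive_percent(samples):
    """For each line, return the percent of stack samples on which it appears."""
    counts = Counter()
    for stack in samples:
        # Count a line once per sample, even if recursion puts it on the stack twice.
        for line in set(stack):
            counts[line] += 1
    total = len(samples)
    return {line: 100.0 * n / total for line, n in counts.items()}

# Hypothetical samples: each one is the call stack (outermost frame first) at one instant.
samples = [
    ["main:10", "render:42", "mandel:7"],
    ["main:10", "render:42", "mandel:7"],
    ["main:10", "render:42", "mandel:9"],
    ["main:10", "save:88"],
]

percents = inclusive_percent(samples)
for line, pct in sorted(percents.items(), key=lambda kv: -kv[1]):
    print(f"{line:12s} {pct:5.1f}%")
```

Even with only four samples, a line that shows up on three of them is clearly worth a look, which is why high measurement precision is not needed.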
There are tools that do this: one is RotateRight/Zoom, and another is LTProf. Personally, I swear by the totally manual method.
Over the last couple of days, we had a performance problem in some code around here. By the manual method, I found one way to save 40%. Then I found a way to save 40% on top of that, for a total saving of 64% (0.6 × 0.6 = 0.36 of the original time remains). That's just one example. Here's an example of saving over 97%.
Added: There are social implications of this that can limit the potential speedup. Suppose there are three problems. Problem A (in your code) takes 1/2 of the time. Problem B (in Jerry's code) takes 1/4 of the time, and problem C (in your code) takes 1/8 of the time. When you sample, problem A jumps out at you, and since it's your code, you fix it, and now the program takes 1/2 the original time. Then you sample again, and problem B (which is now 1/2) jumps out at you. You see that it is in Jerry's code, so you have to explain it to Jerry, trying not to embarrass him, and ask him if he could fix it. If he doesn't for whatever reason (like that was some of his favorite code) then even if you fix problem C, time could only be reduced to 3/8 of the original time. If he does fix it, you can fix C and get down to 1/8 of the original time. Then there could be another problem D (yours) that if you fix it could get the time down to 1/16 of the original time, but if Jerry doesn't fix problem B you can't do any better than 5/16. That is how social interaction can be absolutely critical in performance tuning.
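The fractions in that scenario can be checked with a few lines of Python. This is only a sketch of the arithmetic, using exact fractions; the dictionary `cost` and the helper `remaining` are names invented for the example, and problem D is assumed to cost 1/16 of the original time, as the figures above imply.

```python
from fractions import Fraction

# Share of the ORIGINAL run time each problem costs (figures from the scenario above).
cost = {
    "A": Fraction(1, 2),   # your code
    "B": Fraction(1, 4),   # Jerry's code
    "C": Fraction(1, 8),   # your code
    "D": Fraction(1, 16),  # your code (assumed cost, implied by the 1/16 result)
}

def remaining(fixed):
    """Run time left, as a fraction of the original, after fixing the given problems."""
    return 1 - sum(cost[p] for p in fixed)

print(remaining(["A"]))                 # 1/2
print(cost["B"] / remaining(["A"]))     # 1/2: B's share of the time once A is fixed
print(remaining(["A", "C"]))            # 3/8: Jerry does not fix B
print(remaining(["A", "B", "C"]))       # 1/8: Jerry fixes B
print(remaining(["A", "B", "C", "D"]))  # 1/16
print(remaining(["A", "C", "D"]))       # 5/16: best you can do without B
```

The gap between 1/16 and 5/16 is entirely problem B, so the one fix you cannot make yourself ends up dominating the final result.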
The only technique I've seen that works (because it was used on me) is to present the information in a sorrowful, apologetic tone, as if it were your problem, and be persistent about presenting the information. The apologetic tone defuses the embarrassment, and the persistence keeps him thinking about it.
Our SD C Profiler tool works with GCC source code. It profiles basic blocks rather than lines; this gives exactly the same information with considerably lower overhead.