WebKit profiler

Posted 2024-09-25 02:17:34, 193 words, 4 views, 0 comments

What are the 'self' and 'total' columns? The 'total' column does not add up to 100% (it is much higher), while the 'self' column looks like it does. I suspect self is non-cumulative and total is cumulative. So if methodA calls methodB, which calls methodC, then under self I'd see the % for each method individually, whereas under total methodA would show the sum of all three methods, methodB would show the sum of the last two, and so on.

Is this correct?

Comments (1)

箜明 2024-10-02 02:17:34

Suppose you have this program:

main() calls A(), which calls B(), which calls C(), and C hangs in a loop for 10 seconds.
The CPU-profiler would say something like this:

total time: 10 sec
routine   self%  inclusive%
   main      0         100
   A         0         100
   B         0         100
   C       100         100

The self time of C would be 10 seconds, 100%. The self time of the others would be essentially zero.

The total (inclusive) time of every one of them would be 10 seconds or 100%. You don't add those up.
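
To make the two columns concrete, here is a minimal sketch of how they could be computed from call-stack samples. This is not how any particular profiler does it; the function name `profile_percentages` and the sample format (a list of routine names, outermost caller first) are my own assumptions for illustration.

```python
from collections import Counter

def profile_percentages(samples):
    """Return {routine: (self_pct, inclusive_pct)} from call-stack samples.

    Hypothetical format: each sample is a list of routine names,
    outermost caller first, innermost (currently running) routine last.
    """
    self_hits = Counter()
    inclusive_hits = Counter()
    total = 0
    for stack in samples:
        if not stack:
            continue                      # ignore empty samples
        total += 1
        self_hits[stack[-1]] += 1         # only the innermost routine gets "self"
        for routine in set(stack):        # once per sample, even if recursive
            inclusive_hits[routine] += 1
    if total == 0:
        return {}
    return {
        routine: (100.0 * self_hits[routine] / total,
                  100.0 * inclusive_hits[routine] / total)
        for routine in inclusive_hits
    }

# 1000 samples, all taken while C is spinning underneath main -> A -> B.
samples = [["main", "A", "B", "C"]] * 1000
table = profile_percentages(samples)
for routine in ("main", "A", "B", "C"):
    self_pct, inclusive_pct = table[routine]
    print(f"{routine:6s} {self_pct:6.1f} {inclusive_pct:10.1f}")
```

The self column sums to 100 because every sample has exactly one innermost routine. The inclusive column has no reason to sum to 100, since one sample counts toward every routine on its stack.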

On the other hand, suppose C spends its 10 seconds doing I/O.
Then the CPU-only profiler would say something like this:

total time: 0 sec
routine   self%  inclusive%
   main      ?           ?
   A         ?           ?
   B         ?           ?
   C         ?           ?

because the only actual CPU time it uses is so short that basically no samples hit it, so to get the percents it is dividing by zero.

OTOH if the samples were on wall-clock time, you would get the first output.
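
The gap between the two clocks is easy to see for yourself. In this small sketch `time.sleep` is only a stand-in for blocking I/O (my assumption, so the example needs no files or sockets), and `measure` is a hypothetical helper name.

```python
import time

def measure(fn):
    """Run fn() and return (wall_seconds, cpu_seconds) it took."""
    wall_start = time.perf_counter()   # wall-clock timer
    cpu_start = time.process_time()    # CPU time of this process; excludes sleeping
    fn()
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return wall, cpu

def blocked_like_io():
    # Stand-in for C() waiting on I/O: the process is blocked, not computing.
    time.sleep(0.2)

wall, cpu = measure(blocked_like_io)
print(f"wall-clock: {wall:.3f} s, CPU: {cpu:.3f} s")
```

A profiler that only counts CPU time sees almost nothing here, while one that samples on wall-clock time sees the full wait.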

A better type of profiler is one that samples the call stack on wall-clock time, tells you inclusive time as a percent of total, and reports it at the line-of-code level, not just for functions. That's useful because a line's inclusive percent is a direct measure of how much could be saved if that line were executed less, and almost no problem can hide from it. Examples of such profilers are Zoom and LTProf, and I'm told OProfile can do it. There's also a simple method that works with any language and requires only a debugger.
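
As a rough illustration of wall-clock stack sampling at the line level, here is a sketch of my own, not the way Zoom, LTProf or OProfile work. It assumes CPython, because it relies on `sys._current_frames()`; `sample_lines` and the demo functions are hypothetical names.

```python
import sys
import threading
import time
import traceback
from collections import Counter

def sample_lines(target, interval=0.01):
    """Run target() on the calling thread while a helper thread samples its stack.

    Returns (line_hits, n_samples). line_hits counts, per (function, line number),
    the samples in which that line was somewhere on the stack (inclusive).
    """
    caller_id = threading.get_ident()
    line_hits = Counter()
    n_samples = 0
    done = threading.Event()

    def sampler():
        nonlocal n_samples
        # wait() returns False on timeout, so this fires once per interval
        # of wall-clock time until done is set.
        while not done.wait(interval):
            frame = sys._current_frames().get(caller_id)  # CPython-specific
            if frame is None:
                continue
            stack = traceback.extract_stack(frame)
            n_samples += 1
            # Count each line once per sample, even under recursion.
            for key in {(f.name, f.lineno) for f in stack}:
                line_hits[key] += 1

    helper = threading.Thread(target=sampler, daemon=True)
    helper.start()
    try:
        target()
    finally:
        done.set()
        helper.join()
    return line_hits, n_samples

def C():
    time.sleep(0.3)   # stand-in for the 10 seconds of blocked time

def B():
    C()

def A():
    B()

line_hits, n_samples = sample_lines(A)
for (name, lineno), hits in line_hits.most_common(5):
    print(f"{name}:{lineno}  {100.0 * hits / n_samples:5.1f}% inclusive")
```

Because the samples are taken on wall-clock time, the call lines in A and B and the sleeping line in C should each show up in nearly every sample, even though almost no CPU time is used.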

Here's a discussion of the issues.
