Profiling a program by activity type
The output of a typical profiler is a list of the functions in your code, sorted by the amount of time each function took while the program ran.
This is very good, but sometimes I'm more interested in what the program was doing most of the time than in where EIP was most of the time.
An example output of my hypothetical profiler is:
Waiting for file IO - 19% of execution time.
Waiting for network - 4% of execution time.
Cache misses - 70% of execution time.
Actual computation - 7% of execution time.
Is there such a profiler? Is it possible to derive such an output from a "standard" profiler?
I'm using Linux, but I'll be glad to hear any solutions for other systems.
Comments (2)
This is Solaris only, but DTrace can monitor almost every kind of I/O, on/off-CPU time, time in specific functions, sleep time, etc. I'm not sure whether it can determine cache misses, though, assuming you mean CPU cache misses; I'm not sure whether that information is made available by the CPU.
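Where DTrace is not available, a very rough on/off-CPU split can be approximated from inside the process by comparing wall-clock time with CPU time. The following is only a minimal sketch in Python, not a replacement for DTrace: the function names `on_off_cpu_split`, `mostly_blocked` and `mostly_computing` are made up for illustration, and it assumes a single-threaded workload. Note that cache-miss stalls still count as CPU time here, so this does not answer the cache-miss part of the question.

```python
import time


def on_off_cpu_split(work):
    """Run work() and return (wall_seconds, cpu_seconds, off_cpu_fraction).

    Off-CPU time (sleeping, blocked on I/O, descheduled) is estimated as
    wall-clock time minus the CPU time charged to this process.
    """
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    work()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    off_cpu = max(wall - cpu, 0.0)  # clamp small clock disagreements
    fraction = off_cpu / wall if wall > 0 else 0.0
    return wall, cpu, fraction


def mostly_blocked():
    # Hypothetical stand-in for waiting on a file or the network.
    time.sleep(0.2)


def mostly_computing():
    # Hypothetical stand-in for actual computation.
    sum(i * i for i in range(200_000))


if __name__ == "__main__":
    for fn in (mostly_blocked, mostly_computing):
        wall, cpu, frac = on_off_cpu_split(fn)
        print(f"{fn.__name__}: wall={wall:.3f}s cpu={cpu:.3f}s off-CPU={frac:.0%}")
```

This only says how much time was spent blocked, not what the program was blocked on; attributing the wait to file I/O versus network needs a tracer or stack samples.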
Please take a look at this and this.
Consider any thread. At any instant of time it is doing something, and it is doing it for a reason, and slowness can be defined as the time it spends for poor reasons - it doesn't need to be spending that time.
Take a snapshot of the thread at a point in time. Maybe it's in a cache miss, in an instruction, in a statement, in a function, called from a call instruction in another function, called from another, and so on, up to call _main. Every one of those steps has a reason, which an examination of the code reveals.
Maybe at that time the disk is coming around to a certain sector, so some data streaming can be started, so a buffer can be filled, so a read statement can be satisfied, in a function, and that function is called from a call site in another function, and that from another, and so on, up to call _main, or whatever happens to be the top of the thread.
So, the way to find bottlenecks is to find when the code is spending time for poor reasons, and the best way to find that is to take snapshots of its state. The EIP, or any other tiny piece of the state, is not going to do it, because it won't tell you why.
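To make the snapshot idea concrete, here is a minimal sketch in Python of taking wall-clock snapshots of another thread's whole call stack. It is an illustration only, not how any particular profiler is implemented; `sample_stacks`, `worker` and `wait_for_data` are hypothetical names, and the `time.sleep` call merely stands in for blocking I/O. It relies on `sys._current_frames()`, which CPython provides for exactly this kind of inspection.

```python
import sys
import threading
import time
import traceback


def sample_stacks(thread_id, count=20, interval=0.01):
    """Take up to `count` wall-clock snapshots of one thread's call stack.

    Each snapshot is a list of (filename, lineno, function) tuples,
    outermost frame first. Samples are taken whether the thread is
    running or blocked, which is the point of wall-clock sampling.
    """
    samples = []
    for _ in range(count):
        frame = sys._current_frames().get(thread_id)
        if frame is not None:  # None once the thread has finished
            stack = traceback.extract_stack(frame)
            samples.append([(f.filename, f.lineno, f.name) for f in stack])
        time.sleep(interval)
    return samples


def wait_for_data():
    time.sleep(0.3)  # hypothetical stand-in for a blocking read


def worker():
    wait_for_data()


if __name__ == "__main__":
    t = threading.Thread(target=worker)
    t.start()
    snapshots = sample_stacks(t.ident)
    t.join()
    # Show one snapshot: every line on it is part of the "reason"
    # the thread was spending that instant.
    for filename, lineno, name in snapshots[-1]:
        print(f"{filename}:{lineno} in {name}")
```

Reading a handful of such snapshots by hand is often enough: if most of them show the same call site on the stack, that call site is where the time is going, and the frames above it say why.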
Very few profilers "get it". The ones that do are the wall-clock-time stack samplers that report, by line of code (not by function), the percent of time active (not the amount of time, and especially not "self" or "exclusive" time). One that does is Zoom, and there are others.
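As a rough illustration of what "percent of time active, by line" means, the sketch below (Python, with made-up sample data) counts, for each line of code, the fraction of stack samples on which that line appears anywhere on the stack. The function name `percent_active_by_line` and the file names in `samples` are hypothetical; real tools differ in detail.

```python
from collections import Counter


def percent_active_by_line(samples):
    """Map (file, line) -> percent of samples whose stack contains that line.

    A line is "active" in a sample if it appears anywhere on the stack,
    so this is inclusive time, not "self" time.
    """
    counts = Counter()
    for stack in samples:
        # set() so a line that recurs in a recursive stack counts once per sample
        for location in set(stack):
            counts[location] += 1
    total = len(samples)
    if total == 0:
        return {}
    return {loc: 100.0 * n / total for loc, n in counts.items()}


# Hypothetical wall-clock stack samples, outermost frame first.
samples = [
    [("main.py", 10), ("io.py", 42)],
    [("main.py", 10), ("io.py", 42)],
    [("main.py", 10), ("io.py", 42)],
    [("main.py", 12), ("calc.py", 7)],
]

report = percent_active_by_line(samples)
for (filename, line), pct in sorted(report.items(), key=lambda kv: -kv[1]):
    print(f"{pct:5.1f}%  {filename}:{line}")
```

With these samples the call at main.py line 10 is on the stack 75% of the time, so removing or speeding up that one call is worth up to 75% of the run time, whichever function happens to sit at the bottom of the stack.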
Looking at where the EIP hangs out is like trying to tell time on a clock with only a second hand. Measuring functions is like trying to tell time on a clock with some of the digits missing. Profiling only during CPU time, not during blocked time, is like trying to tell time on a clock that randomly stops running for long stretches. Being concerned about measurement precision is like trying to time your lunch hour to the second.
This is not a mysterious subject.