Callgrind is slow even with instrumentation turned off
I am using callgrind to profile a multi-threaded Linux app, and mostly it works great. I start it with instrumentation off (--instr-atstart=no), and once setup is done I turn it on with callgrind_control -i on. However, when I change certain configurations to profile a different part of the app, it starts running extremely slowly even before I turn instrumentation on. A part of the code that takes a few seconds in normal operation takes over an hour under callgrind (with instrumentation turned off). Any ideas as to why that might be, and how to go about debugging or resolving the slowness?
Comments (1)
Callgrind is a tool built on Valgrind. Valgrind is essentially a dynamic binary translator (libVEX, part of Valgrind): it decodes every instruction and JIT-compiles it into a stream of instructions for the same CPU.
As far as I know, Valgrind has no way to enable this translation for an already running process, so dynamic translation is active the whole time, from program start. It cannot be turned off either.
Tools are built on Valgrind by adding instrumentation code. The "null" tool (Nulgrind, selected with --tool=none) adds no instrumentation at all. But every tool runs on Valgrind, so dynamic translation is always active. Turning instrumentation on and off in Callgrind only toggles the additional instrumentation, not the translation itself.
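One way to check how much of the slowdown comes from Valgrind's translation alone is to time the same run natively, under Nulgrind, and under Callgrind with instrumentation off. The following is only a rough sketch: the helper names (build_commands, time_command, compare) are made up for illustration, and it assumes valgrind is on the PATH and that the app runs to completion on its own.

```python
import subprocess
import sys
import time


def build_commands(app_argv):
    """Return labelled command lines for the three runs to compare."""
    return {
        # Plain run: no Valgrind involved at all.
        "native": list(app_argv),
        # Nulgrind: Valgrind's translation only, no tool instrumentation.
        "nulgrind": ["valgrind", "--tool=none", *app_argv],
        # Callgrind with instrumentation off at start, as in the question.
        "callgrind-off": [
            "valgrind", "--tool=callgrind", "--instr-atstart=no", *app_argv
        ],
    }


def time_command(argv):
    """Run argv to completion and return elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(argv, check=False)
    return time.perf_counter() - start


def compare(app_argv):
    """Time each variant and print its slowdown relative to the native run."""
    timings = {
        name: time_command(cmd)
        for name, cmd in build_commands(app_argv).items()
    }
    base = timings["native"]
    for name, secs in timings.items():
        print(f"{name:15s} {secs:8.2f}s  x{secs / base:.1f}")
    return timings


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical script name): python compare_runs.py ./myapp --some-option
    compare(sys.argv[1:])
```

If the Nulgrind run is already about as slow as the Callgrind run, the cost is in Valgrind's core (translation, thread serialisation) rather than in anything Callgrind adds.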
The virtual CPU implemented by Valgrind is limited; there is an (incomplete) list of limitations at http://valgrind.org/docs/manual/manual-core.html#manual-core.limits Most of the limitations concern floating-point operations, which may be emulated incorrectly.
Is your configuration change connected with floating-point operations, or with any of the other listed limitations?
You should also know that "Valgrind serialises execution so that only one thread is running at a time" (from the same page, manual-core.html).