How does a C# profiler work?
I'm curious: how does a typical C# profiler work?
Are there special hooks in the virtual machine?
Is it easy to scan the bytecode for function calls and inject calls to start/stop a timer?
Or is it really hard and that's why people pay for tools to do this?
(As a side note, I find it a bit interesting because it's so rare a question: Google misses the boat completely, and the search "how does a c# profiler work?" doesn't work at all - the results are about air conditioners...)
4 Answers
There is a free CLR Profiler by Microsoft, version 4.0.
https://www.microsoft.com/downloads/en/details.aspx?FamilyID=be2d842b-fdce-4600-8d32-a3cf74fda5e1
BTW, there's a nice section in the CLR Profiler doc that describes how it works, in detail, on page 103. The source code is included as part of the distribution.
Injecting calls is hard enough that tools are needed to do it.
Not only is it hard, it's a very indirect way to find bottlenecks.
The reason is that a bottleneck is one or a small number of statements in your code responsible for a good percentage of the time being spent, time that could be reduced significantly - i.e. it's not truly necessary; it's wasteful.
IF you can tell the average inclusive time of one of your routines (including IO time), and IF you can multiply it by how many times it has been called, and divide by the total time, you can tell what percent of time the routine takes.
If the percent is small (like 10%) you probably have bigger problems elsewhere.
If the percent is larger (like 20% to 99%) you could have a bottleneck inside the routine.
So now you have to hunt inside the routine for it, looking at things it calls and how much time they take. Also you want to avoid being confused by recursion (the bugaboo of call graphs).
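The arithmetic in that inclusive-time approach can be sketched as follows (all the numbers here are hypothetical, just to show the computation):

```python
# Hypothetical measurements for one routine in an instrumented run:
avg_inclusive_seconds = 0.5   # average inclusive time per call, including I/O
call_count = 40               # how many times the routine was called
total_seconds = 40.0          # total run time of the program

# Percent of total time the routine is responsible for:
percent = avg_inclusive_seconds * call_count / total_seconds * 100
print(f"routine accounts for {percent:.0f}% of total time")  # 50% here
```

Note that even with these numbers in hand, you only know the routine is suspect; you still have to hunt inside it for the guilty lines.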
There are profilers (such as Zoom for Linux, Shark, & others) that work on a different principle.
The principle is that there is a function call stack, and during all the time a routine is responsible for (either doing work or waiting for other routines to do work that it requested) it is on the stack.
So if it is responsible for 50% of the time (say), then that's the amount of time it is on the stack,
regardless of how many times it was called, or how much time it took per call.
Not only is the routine on the stack, but the specific lines of code costing the time are also on the stack.
You don't need to hunt for them.
Another thing you don't need is precision of measurement.
If you took 10,000 stack samples, the guilty lines would be measured at 50 +/- 0.5 percent.
If you took 100 samples, they would be measured as 50 +/- 5 percent.
If you took 10 samples, they would be measured as 50 +/- 16 percent.
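Those +/- figures follow from the binomial standard error of a proportion; a quick check (assuming the guilty line is on half the samples):

```python
import math

p = 0.5  # fraction of stack samples showing the guilty line
for n in (10_000, 100, 10):
    # Binomial standard error of an observed proportion p over n samples
    stderr_pct = 100 * math.sqrt(p * (1 - p) / n)
    print(f"{n:>6} samples: 50 +/- {stderr_pct:.1f} percent")
```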
In every case you find them, and that is your goal.
(And recursion doesn't matter. All it means is that a given line can appear more than once in a given stack sample.)
On this subject, there is lots of confusion. At any rate, the profilers that are most effective for finding bottlenecks are the ones that sample the stack, on wall-clock time, and report percent by line. (This is easy to see if certain myths about profiling are put in perspective.)
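As a sketch of what such a sampler does, here is a minimal wall-clock stack sampler in Python (the workload function and sampling interval are made up for illustration; `sys._current_frames` is a CPython-specific way to read another thread's stack, and real C# tools would do the equivalent against the CLR instead):

```python
import collections
import sys
import threading
import time
import traceback

def sample_stacks(thread_id, interval, samples, stop):
    """Wall-clock sampler: snapshot the target thread's stack at fixed intervals."""
    while not stop.is_set():
        frame = sys._current_frames().get(thread_id)
        if frame is not None:
            # Record every (file, line) on the stack, not just the top frame:
            # a line is charged for all the time it is anywhere on the stack.
            samples.append({(f.f_code.co_filename, lineno)
                            for f, lineno in traceback.walk_stack(frame)})
        time.sleep(interval)

def workload():  # made-up workload; its loop line should dominate the samples
    total = 0
    for i in range(2_000_000):
        total += i * i
    return total

samples = []
stop = threading.Event()
sampler = threading.Thread(
    target=sample_stacks, args=(threading.get_ident(), 0.001, samples, stop))
sampler.start()
workload()
stop.set()
sampler.join()

# Report the percent of samples each line appeared on (inclusive percent by line).
line_counts = collections.Counter()
for stack in samples:
    line_counts.update(stack)
for (filename, lineno), n in line_counts.most_common(3):
    print(f"{filename}:{lineno}  on {100 * n / len(samples):.0f}% of samples")
```

The report is exactly "percent by line": lines near 100% are the ones worth looking at, and no call-graph bookkeeping or timing precision was needed.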
1) There's no such thing as "typical". People collect profile information by a variety of means: time sampling the PC, inspecting stack traces, capturing execution counts of methods/statements/compiled instructions, inserting probes in code to collect counts and optionally calling contexts to get profile data on a call-context basis. Each of these techniques might be implemented in different ways.
2) There's profiling "C#" and profiling "CLR". In the MS world, you could profile CLR and back-translate CLR instruction locations to C# code. I don't know if Mono uses the same CLR instruction set; if they did not, then you could not use the MS CLR profiler; you'd have to use a Mono IL profiler. Or, you could instrument C# source code to collect the profiling data, and then compile/run/collect that data on either MS, Mono, or somebody's C# compatible custom compiler, or C# running in embedded systems such as WinCE where space is precious and features like CLR-built-ins tend to get left out.
One way to instrument source code is to use source-to-source transformations, to map the code from its initial state to code that contains data-collecting code as well as the original program. This paper on instrumenting code to collect test coverage data shows how a program transformation system can be used to insert test coverage probes by inserting statements that set block-specific boolean flags when a block of code is executed. A counting profiler substitutes counter-incrementing instructions for those probes. A timing profiler inserts clock-snapshot/delta computations for those probes. Our C# Profiler implements both counting and timing profiling for C# source code; it also collects call-graph data by using more sophisticated probes that record the execution path, and can thus produce timing data on call graphs. This scheme works anywhere you can get your hands on a halfway-decent-resolution time value.
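To illustrate the shape of those inserted probes (in Python rather than C#, and with hand-inserted probes standing in for what a transformation system would insert mechanically; the `work`/`probe_enter`/`probe_exit` names are made up):

```python
import time

call_counts = {}      # counting profiler: counter per routine
inclusive_time = {}   # timing profiler: accumulated clock deltas per routine

def probe_enter(name):
    """Probe inserted at routine entry: bump the counter, snapshot the clock."""
    call_counts[name] = call_counts.get(name, 0) + 1
    return time.perf_counter()

def probe_exit(name, started):
    """Probe inserted at routine exit: accumulate the clock delta."""
    inclusive_time[name] = inclusive_time.get(name, 0.0) \
        + (time.perf_counter() - started)

# Original routine:
#   def work(n):
#       return sum(i * i for i in range(n))
#
# After the (hypothetical) source-to-source transformation:
def work(n):
    _t = probe_enter("work")
    try:
        return sum(i * i for i in range(n))
    finally:           # exit probe fires on both return and exception
        probe_exit("work", _t)

work(10_000)
work(10_000)
print(call_counts["work"], "calls,", inclusive_time["work"], "seconds inclusive")
```

Since the probes only need a counter and a clock, this works on any target where a decent-resolution timer is available, which is the point made above about embedded systems.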
This is a link to a lengthy article that discusses both instrumentation and sampling methods:
http://smartbear.com/support/articles/aqtime/profiling/