Microsecond profiler for C code

Posted 2024-09-03 02:43:13

Does anybody know of a C code profiler, like gprof, that gives function call times in microseconds instead of milliseconds?

4 Answers

酒绊 2024-09-10 02:43:13

Take a look at Linux perf. You will need a pretty recent kernel though.
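For concreteness, here is a minimal sketch of that workflow. The program and the function name busy_work are illustrative, not from the answer; perf record -F sets the sampling frequency, which is what determines the effective resolution rather than any fixed millisecond unit.

```c
/* Minimal sketch of profiling a hot function with Linux perf.
 * The function name "busy_work" and the counts are illustrative.
 *
 *   gcc -O2 -g -o demo demo.c
 *   perf record -F 10000 -g ./demo   # ~10000 samples/s, with call graphs
 *   perf report                      # per-function breakdown of samples
 */
#include <stdio.h>

static double busy_work(long n)
{
    double acc = 0.0;
    for (long i = 1; i <= n; i++)
        acc += 1.0 / (double)i;   /* cheap per-iteration work to sample */
    return acc;
}

int main(void)
{
    printf("%f\n", busy_work(200000000L));
    return 0;
}
```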

狂之美人 2024-09-10 02:43:13

Let me just suggest how I would handle this, assuming you have the source code.

The average inclusive time per invocation of a function (including I/O), multiplied by the number of invocations and divided by the total running time, gives you the fraction of time under the control of that function. That fraction is how you know whether the function is a big enough time-taker to bother optimizing. That is not easy information to get from gprof.

Another way to learn what fraction of inclusive time is spent under the control of each function is timed or random sampling of the call stack. If a function appears on a fraction X of the samples (even if it appears more than once in a sample), then X is the time-fraction it takes (within a margin of error). What's more, this gives you per-line fraction of time, not just per-function.

That fraction X is the most valuable information you can get, because it is the fraction of total running time you could potentially save by optimizing that function or line of code.

The Zoom profiler is a good tool for getting this information.

What I would do is wrap a long-running loop around the top-level code, so that it executes repeatedly, long enough to take at least several seconds. Then I would manually sample the stack by interrupting or pausing it at random. It actually takes very few samples, like 10 or 20, to get a really clear picture of the most time-consuming functions and/or lines of code.

Here's an example.

P.S. If you're worried about statistical accuracy, let me get quantitative. If a function or line of code is on the stack exactly 50% of the time, and you take 10 samples, then the number of samples that show it will be 5 +/- 1.6, for a margin of error of 16%. If the actual time is smaller or larger, the margin of error shrinks. You can also reduce the margin of error by taking more samples. To get 1.6%, take 1000 samples. Actually, once you've found the problem, it's up to you to decide if you need a smaller margin of error.
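For concreteness, here is a rough sketch of the wrap-and-sample idea from the answer above; the loop count and the name top_level_work are made up for the example, and the gdb session in the comments is one way to take the random stack samples.

```c
/* Sketch of wrapping the top-level code in a long-running loop and
 * sampling the stack by hand. To sample:
 *
 *   gcc -O0 -g -o sample_me sample_me.c
 *   gdb ./sample_me
 *   (gdb) run
 *   ... press Ctrl-C at a random moment ...
 *   (gdb) bt        # note which functions/lines are on the stack
 *   (gdb) continue  # repeat 10-20 times
 */
static volatile double sink;       /* keeps the work from being optimized out */

static void top_level_work(void)   /* stand-in for the real workload */
{
    double x = 0.0;
    for (long i = 0; i < 5000000; i++)
        x += i * 0.5;
    sink = x;
}

int main(void)
{
    /* Repeat long enough that the whole run lasts at least several
     * seconds, as the answer suggests. */
    for (int i = 0; i < 1000; i++)
        top_level_work();
    return 0;
}
```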

夏日落 2024-09-10 02:43:13

gprof gives results either in milliseconds or in microseconds. I do not know the exact rationale, but my experience is that it displays results in microseconds when it decides there is enough precision for them. To get microsecond output, run the program for a longer time and/or avoid having any single routine that takes too long to run.
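As a sketch of how to try this, the usual gprof workflow is below; the program and names are illustrative. A workload made of many short calls is the kind of case where, per the answer above, gprof is more likely to print the per-call columns in microseconds.

```c
/* Minimal gprof workflow; "demo" and "inner" are placeholder names.
 *
 *   gcc -pg -O0 -o demo demo.c   # -pg adds gprof instrumentation
 *   ./demo                       # writes gmon.out on exit
 *   gprof ./demo gmon.out        # flat profile; per-call columns appear
 *                                # in ms/call or us/call depending on scale
 */
#include <stdio.h>

static double inner(long n)
{
    double s = 0.0;
    for (long i = 0; i < n; i++)
        s += (double)i;
    return s;
}

int main(void)
{
    double total = 0.0;
    for (int i = 0; i < 100000; i++)   /* many short calls: small per-call time */
        total += inner(1000);
    printf("%f\n", total);
    return 0;
}
```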

痴者 2024-09-10 02:43:13

oprofile gives you times at clock resolution, i.e. nanoseconds. It produces output files compatible with gprof, so it is very convenient to use.

http://oprofile.sourceforge.net/news/
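As a sketch of the workflow (assuming an operf-based oprofile, 0.9.8 or later; the binary name demo is a placeholder):

```c
/* oprofile workflow sketch:
 *
 *   gcc -O2 -g -o demo demo.c
 *   operf ./demo                # collect samples while the program runs
 *   opreport --symbols ./demo   # per-function sample breakdown
 *   opgprof ./demo              # write a gprof-compatible gmon.out
 *   gprof ./demo gmon.out       # then inspect it with gprof as usual
 */
```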
