Microsecond profiler for C code

Posted 2024-09-03 02:43:13

Does anybody know of a C code profiler like gprof which gives function call times in microseconds instead of milliseconds?


Comments (4)

酒绊 2024-09-10 02:43:13

Take a look at Linux perf. You will need a pretty recent kernel, though.
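For reference, here is a minimal sketch of a typical perf session; none of this is from the answer itself, and demo.c is just a stand-in name:

    /* Build with frame pointers so perf can unwind the stack, then
     * record and browse (commands are the standard ones):
     *
     *   gcc -g -fno-omit-frame-pointer demo.c -o demo
     *   perf record -g ./demo      # sample the run with call graphs
     *   perf report                # time per function and call chain
     */
    #include <stdio.h>

    int main(void)
    {
        double x = 0;
        /* cheap loop so perf has something to sample */
        for (long i = 1; i < 50000000; i++)
            x += 1.0 / i;
        printf("%f\n", x);
        return 0;
    }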

狂之美人 2024-09-10 02:43:13

Let me just suggest how I would handle this, assuming you have the source code.

The average inclusive time a function takes per invocation (including I/O), multiplied by the number of invocations and divided by the total running time, gives you the fraction of time spent under the control of that function. That fraction is how you know whether the function takes enough time to be worth optimizing. That information is not easy to get from gprof.
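As a concrete illustration of that arithmetic, here is a sketch of my own (not part of the answer); work() is a hypothetical function under study, and on very old glibc you may need to link with -lrt for clock_gettime:

    #include <stdio.h>
    #include <time.h>

    /* current monotonic time in seconds */
    static double now_sec(void)
    {
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return ts.tv_sec + ts.tv_nsec / 1e9;
    }

    /* hypothetical function whose time fraction we want */
    static void work(void)
    {
        volatile double x = 0;
        for (int i = 0; i < 100000; i++)
            x += i * 0.5;
    }

    int main(void)
    {
        double t_start = now_sec();
        double t_in_work = 0;
        long calls = 0;

        for (int i = 0; i < 1000; i++) {
            double t0 = now_sec();
            work();                       /* inclusive: callees and I/O count */
            t_in_work += now_sec() - t0;
            calls++;
        }

        double total = now_sec() - t_start;
        printf("avg inclusive time: %.3f us/call over %ld calls\n",
               1e6 * t_in_work / calls, calls);
        printf("fraction of run time under work(): %.1f%%\n",
               100.0 * t_in_work / total);
        return 0;
    }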

Another way to learn what fraction of inclusive time is spent under the control of each function is timed or random sampling of the call stack. If a function appears on a fraction X of the samples (even if it appears more than once in a sample), then X is the time-fraction it takes (within a margin of error). What's more, this gives you per-line fraction of time, not just per-function.

That fraction X is the most valuable information you can get, because that is the total amount of time you could potentially save by optimizing that function or line of code.

The Zoom profiler is a good tool for getting this information.

What I would do is wrap a long-running loop around the top-level code, so that it executes repeatedly, long enough to take at least several seconds. Then I would manually sample the stack by interrupting or pausing it at random. It actually takes very few samples, like 10 or 20, to get a really clear picture of the most time-consuming functions and/or lines of code.
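In outline, the wrapper loop might look like the sketch below (my framing, not the answer's literal code; top_level() is a hypothetical stand-in for the program's real work). The sampling itself can be done by hand under gdb:

    /* Sampling by hand, e.g. with gdb:
     *
     *   gdb ./prog
     *   (gdb) run
     *   ^C                (interrupt at a random moment)
     *   (gdb) bt          (record the stack: one sample)
     *   (gdb) continue    (repeat 10-20 times)
     */
    static void top_level(void)
    {
        /* ... the program's real work would go here ... */
    }

    int main(void)
    {
        /* Repeat enough times that the whole run lasts at least several
         * seconds, so random pauses land inside the interesting code. */
        for (long i = 0; i < 100000000; i++)
            top_level();
        return 0;
    }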

Here's an example.

P.S. If you're worried about statistical accuracy, let me get quantitative. If a function or line of code is on the stack exactly 50% of the time, and you take 10 samples, then the number of samples that show it will be 5 +/- 1.6, for a margin of error of 16%. If the actual time is smaller or larger, the margin of error shrinks. You can also reduce the margin of error by taking more samples. To get 1.6%, take 1000 samples. Actually, once you've found the problem, it's up to you to decide if you need a smaller margin of error.
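Those figures follow from the binomial distribution: the number of hits in n samples with true fraction p has standard deviation sqrt(n*p*(1-p)). A tiny check of the arithmetic (my addition; compile with -lm):

    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
        double n = 10, p = 0.5;             /* samples taken, true time fraction */
        double sd = sqrt(n * p * (1 - p));  /* binomial standard deviation */
        printf("expected hits: %.1f +/- %.2f\n", n * p, sd);  /* 5 +/- 1.58 */
        printf("relative margin: %.1f%%\n", 100 * sd / n);    /* ~16%; ~1.6% at n=1000 */
        return 0;
    }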

夏日落 2024-09-10 02:43:13

gprof gives results either in milliseconds or in microseconds. I do not know the exact rationale, but my experience is that it displays results in microseconds when it thinks there is enough precision. To get microsecond output, you need to run the program for a longer time and/or avoid having any routine that takes too much time to run.
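To see which unit you get, the standard workflow is enough (a sketch of my own, not the answerer's; prog.c is a stand-in name):

    /* Typical gprof workflow:
     *
     *   gcc -pg prog.c -o prog     # instrument for gprof
     *   ./prog                     # writes gmon.out on exit
     *   gprof prog gmon.out        # flat profile + call graph
     *
     * Whether the per-call columns read ms/call or us/call depends on
     * the magnitudes involved, as the answer describes. */
    static volatile long sink;

    /* cheap routine: per-call time far below a millisecond */
    static void tiny(void) { sink++; }

    int main(void)
    {
        for (long i = 0; i < 50000000; i++)
            tiny();
        return 0;
    }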

痴者 2024-09-10 02:43:13


oprofile gets you times at clock resolution, i.e. nanoseconds. It produces output files compatible with gprof, so it is very convenient to use.

http://oprofile.sourceforge.net/news/
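The exact commands depend on the oprofile release; as a hedged sketch (my addition), newer versions use operf rather than the old opcontrol daemon:

    /*   operf ./prog               # profile a single run
     *   opreport --symbols         # per-symbol breakdown
     *
     * opgprof can also emit a gprof-readable gmon.out, which is the
     * gprof compatibility the answer mentions. */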
