How does GCC's "-pg" flag work in relation to profilers?

Posted on 2024-12-02 15:28:47

I'm trying to understand how the -pg (or -p) flag works when compiling C code with GCC.

The official GCC documentation only states:

-pg
Generate extra code to write profile information suitable for the analysis program gprof. You must use this option when compiling the source files you want data about, and you must also use it when linking.

This really interests me, as I'm doing some research on profilers and trying to pick the best tool for the job.

Comments (3)

再见回来 2024-12-09 15:28:47

Compiling with -pg instruments your code so that gprof can report detailed information. See the gprof manual, section 9.1, Implementation of Profiling:

Profiling works by changing how every function in your program is compiled so that when it is called, it will stash away some information about where it was called from. From this, the profiler can figure out what function called it, and can count how many times it was called. This change is made by the compiler when your program is compiled with the -pg option, which causes every function to call mcount (or _mcount, or __mcount, depending on the OS and compiler) as one of its first operations.

The mcount routine, included in the profiling library, is responsible for recording in an in-memory call graph table both its parent routine (the child) and its parent's parent. This is typically done by examining the stack frame to find both the address of the child, and the return address in the original parent. Since this is a very machine-dependent operation, mcount itself is typically a short assembly-language stub routine that extracts the required information, and then calls __mcount_internal (a normal C function) with two arguments—frompc and selfpc. __mcount_internal is responsible for maintaining the in-memory call graph, which records frompc, selfpc, and the number of times each of these call arcs was traversed.

...

Please note that with such an instrumenting profiler, you are not profiling exactly the same code you would build for release without profiling instrumentation. The instrumentation code itself adds overhead, and it may also alter instruction and data cache usage.

In contrast to an instrumenting profiler, a sampling profiler like Intel VTune works on non-instrumented code by looking at the target program's program counter at regular intervals using operating system interrupts. It can also query special CPU registers to give you even more insight into what's going on.

See also Profilers Instrumenting Vs Sampling.

‖放下 2024-12-09 15:28:47

This link gives a brief explanation of how gprof works.

This link gives an extensive critique of it.
(Check my answer to the archived question.)

戈亓 2024-12-09 15:28:47

From "Measuring Function Duration with Ftrace":

Instrumentation comes in two main forms: explicitly declared tracepoints, and implicit tracepoints.

Explicit tracepoints consist of developer-defined declarations which specify the location of the tracepoint, and additional information about what data should be collected at a particular trace site. Implicit tracepoints are placed into the code automatically by the compiler, either due to compiler flags or by developer redefinition of commonly used macros.

To instrument functions implicitly, when the kernel is configured to support function tracing, the kernel build system adds -pg to the flags used with the compiler. This causes the compiler to add code to the prologue of each function, which calls a special assembly routine called mcount. This compiler option is specifically intended to be used for profiling and tracing purposes.
