How do I count the number of instructions executed on Red Hat Enterprise Linux (x86-64)?
I want to find out how many x86-64 instructions are executed during a given run of a program on Red Hat Enterprise Linux. I know I can get this information from valgrind, but the slowdown is considerable. I also know that we are using Intel Core 2 Quad CPUs (model Q6700), which have hardware performance counters built in. But I don't know of any way to read the total number of instructions executed from within a C program.
The Performance Application Programming Interface (PAPI) appears to be along the lines of what you are looking for.
Vince Weaver, a Post Doctoral Research Associate with the Innovative Computing Laboratory at the University of Tennessee, has done some PAPI-related work. The research listed on his web page at UTK looks like it may provide some additional information.
libpapi is the library you are looking for.
Both AMD and Intel chips provide instruction counts.
The cycle counter register can be read directly from C (sorry, the code is non-portable, but it works fine with gcc). Note that this counts cycles, which is not the same thing as instructions: modern processors can spend several cycles on a single instruction, or execute several instructions in the same cycle. Cycles are usually more interesting than the number of instructions, but it depends on your actual purpose.
Other performance counters can presumably be accessed in the same way (I don't actually know whether there is another way), but I would have to look up the actual instruction to use.
There are a couple of ways you could go about it, depending on exactly what you need. If you just want the total number of instructions present in the binary (a static count, not what actually executes), you could run objdump on the binary, which will give you the assembly. If you want more detailed information about the instructions actually executed on a given run of the program, you may want to look into DynamoRIO, which provides that functionality. It is similar to valgrind, but I believe it has a smaller effect on performance. I was able to throw together a basic instruction counter with it back in September relatively quickly and easily.
If that's no good, you could try PAPI, an API that should let you get at the performance counters on your processors. I've never used it, so I can't vouch for it, but a friend of mine used it in a project about 6 months ago and said he found it very helpful.