Hardware performance counter API for Windows
I'd like to use hardware performance counters, specifically on x86 CPUs, to measure cache misses or branch mispredictions. These counters are heavily used by advanced profilers such as Intel VTune. Please don't confuse them with the performance counters provided by the Windows operating system.
To use these counters in a C/C++ program, one can use PAPI: http://icl.cs.utk.edu/papi/
It makes performance counters easy to use, but only on Linux. PAPI once supported Windows, but no longer does.
Has anyone recently tried PAPI or another API to use hardware performance counters on Windows?
Comments (2)
You can use the RDPMC instruction or the __readpmc MSVC compiler intrinsic, which compiles to the same instruction.
However, Windows prohibits user-mode applications from executing this instruction by setting CR4.PCE to 0. Presumably, this is done because the meaning of each counter is determined by MSRs (model-specific registers), which are only accessible in kernel mode. In other words, unless your code runs in kernel mode (e.g. as a device driver), you will get a "privileged instruction" exception if you attempt to execute it.
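To see this behaviour for yourself, a small probe along the following lines might help. It is only a sketch: `try_read_pmc` is a name made up for illustration, counter index 0 is an arbitrary choice, and the code assumes MSVC on x86/x64 with structured exception handling. It also assumes the fault surfaces as `EXCEPTION_PRIV_INSTRUCTION`; an access violation is caught as well, in case the fault is reported that way. On any other compiler the function simply reports failure so that the file still builds.

```cpp
#include <cstdint>

#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
#include <windows.h>
#include <intrin.h>

// Hypothetical helper: attempts RDPMC via the __readpmc intrinsic.
// Returns true and stores the counter value if the instruction executed,
// or false if the CPU faulted (expected in user mode when CR4.PCE is 0).
bool try_read_pmc(unsigned long counter, std::uint64_t* value) {
    __try {
        *value = __readpmc(counter);
        return true;
    } __except ((GetExceptionCode() == EXCEPTION_PRIV_INSTRUCTION ||
                 GetExceptionCode() == EXCEPTION_ACCESS_VIOLATION)
                    ? EXCEPTION_EXECUTE_HANDLER
                    : EXCEPTION_CONTINUE_SEARCH) {
        // The instruction was rejected; *value has not been written.
        return false;
    }
}

#else

// Fallback for other toolchains: no attempt is made, so the probe
// always reports failure and leaves *value untouched.
bool try_read_pmc(unsigned long, std::uint64_t*) {
    return false;
}

#endif
```

Going by the explanation above, calling `try_read_pmc(0, &v)` from an ordinary user-mode process on Windows would be expected to return false. A true result would only tell you that the instruction executed, not what the counter is measuring, since that is configured through the MSRs.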
If you're writing a user-mode application, your only option is (as @Christopher mentioned in the comments) to write a kernel module that executes this instruction on your behalf (you'll incur the user-to-kernel call overhead), and to enable test signing on your machine so that your presumably self-signed "driver" can be loaded. This means you can't easily distribute the application, but it will work for in-house tuning.
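The user-mode half of that arrangement could look roughly like the sketch below. Everything specific in it is an assumption: the device path `\\.\PmcReaderSketch`, the IOCTL code, and the request layout (a 32-bit counter index in, a 64-bit value out) are hypothetical and would have to match whatever your own driver registers. Only `CreateFileW`, `DeviceIoControl`, `CloseHandle` and the `CTL_CODE` macro are real Win32 facilities. The kernel side, which would execute RDPMC and copy the result back, is not shown.

```cpp
#include <cstdint>

#ifdef _WIN32
#include <windows.h>
#include <winioctl.h>

// Hypothetical names: no real driver exposes this device or IOCTL.
static const wchar_t kDevicePath[] = L"\\\\.\\PmcReaderSketch";
static const DWORD kIoctlReadPmc =
    CTL_CODE(FILE_DEVICE_UNKNOWN, 0x800, METHOD_BUFFERED, FILE_READ_ACCESS);

// Asks the (hypothetical) driver to read one counter in kernel mode.
// Returns false if the driver is not loaded or the request fails.
bool read_pmc_via_driver(std::uint32_t counter, std::uint64_t* value) {
    HANDLE device = CreateFileW(kDevicePath, GENERIC_READ, 0, nullptr,
                                OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, nullptr);
    if (device == INVALID_HANDLE_VALUE) {
        return false;  // driver missing or access denied
    }

    DWORD bytes_returned = 0;
    BOOL ok = DeviceIoControl(device, kIoctlReadPmc,
                              &counter, sizeof(counter),   // input: counter index
                              value, sizeof(*value),       // output: counter value
                              &bytes_returned, nullptr);
    CloseHandle(device);
    return ok && bytes_returned == sizeof(*value);
}

#else

// Non-Windows builds: there is no such device, so always report failure.
bool read_pmc_via_driver(std::uint32_t, std::uint64_t*) {
    return false;
}

#endif
```

Each call here crosses into the kernel and back, which is the penalty mentioned above, so for tight measurement loops it may be worth having the driver read several counters per request.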
What about this HCP Reference? Does it not provide what you want?