How to analyze TLB hits and TLB misses in Ubuntu
I have written a simple C++ program that uses a for loop to print the numbers from 1 to 100. I want to find the number of TLB hits and misses that occur while this particular program runs. Is there any way to get this data?
I am using Ubuntu, and I have tried the perf tool, but it produces different results on different runs. I am confused about which part of my code leads to such a large number of TLB hits, TLB misses, and cache misses.
Of course, other processes such as the Ubuntu GUI may be running at the same time. Do these results include those processes too?
Command I used:
    perf stat -e dTLB-loads -e dTLB-misses ./hellocc
Result, first run:
    Performance counter stats for './hellocc':
    909,822 dTLB-loads
    2,023 dTLB-misses # 0.22% of all dTLB cache hits
    4,512 cache-misses
    0.006821182 seconds time elapsed
Result, second run:
    Performance counter stats for './hellocc':
    907,810 dTLB-loads
    2,045 dTLB-misses # 0.23% of all dTLB cache hits
    4,533 cache-misses
    0.006780635 seconds time elapsed
My simple code:
    #include <iostream>
    using namespace std;

    int main()
    {
        cout << "hello" << "\n";
        // Print the numbers 1 to 100, separated by tabs.
        for (int i = 1; i <= 100; i = i + 1)
            cout << i << "\t";
        return 0;
    }
Comments (2)
One way you could simulate this is with cachegrind, which is part of valgrind.
It is a simulation rather than a measurement of your real hardware, which I don't think you can get at directly, but it's a good stand-in.
The cache behaviour of your program depends on what else is happening on your system at the time.
On a Linux system there are many processes running, such as the X server and window manager, the terminal, your editor, various daemon processes, and whatever else you have running (such as a web browser).
Depending on the vagaries of the scheduler and the demands these other programs place on your system, your program's data may or may not stay in cache (the kernel may even page your process out to swap entirely), so the number of cache misses will vary depending on what else is running.