Finding the number of context switches

Posted on 2025-01-20 02:28:05

To measure the number of context switches in a multithreaded application, I used two methods: 1) perf stat with the sched:sched_switch and context-switches events, and 2) the counters in /proc/pid/status. The difference between the two is quite large, though. The steps I took are:

1- Using the perf command, the number of switches is 7,601 in the run shown below.

$ sudo perf stat -e sched:sched_switch,task-clock,context-switches,cpu-migrations,page-faults,cycles,instructions ./mm_double_omp 4

Using 4 threads
PID = 395944

Performance counter stats for './mm_double_omp 4':

         7,601      sched:sched_switch        #    0.044 K/sec
    173,377.19 msec task-clock                #    3.973 CPUs utilized
         7,601      context-switches          #    0.044 K/sec
             2      cpu-migrations            #    0.000 K/sec
        24,780      page-faults               #    0.143 K/sec
 164,393,781,352      cycles                    #    0.948 GHz
 69,723,515,498      instructions              #    0.42  insn per cycle

  43.636463582 seconds time elapsed

 173.244505000 seconds user
   0.123880000 seconds sys

Please note that sched:sched_switch and context-switches report the same count. If I use only sched:sched_switch, the number is still on the order of 7,000.

2- I modified the code to copy the /proc/pid/status file twice: at the beginning and at the end of the program.

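#include <stdio.h>   // printf, snprintf
#include <stdlib.h>  // system
#include <unistd.h>  // getpid

int main() {
   char cmdbuf[256];
   int pid_num = getpid();
   printf("PID = %d\n", pid_num);
   // Snapshot the counters before the workload.
   snprintf(cmdbuf, sizeof(cmdbuf), "sudo cp /proc/%d/status %s", pid_num, "start.txt" );
   system(cmdbuf);
   // DO (the multithreaded workload runs here)
   // Snapshot the counters again after the workload.
   snprintf(cmdbuf, sizeof(cmdbuf), "sudo cp /proc/%d/status %s", pid_num, "finish.txt" );
   system(cmdbuf);
   return 0;
}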
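
The same two snapshots could probably be taken without shelling out to sudo cp, by letting the program read its own /proc/self/status. The following is only a minimal sketch, assuming the Linux procfs field names that appear in the output below; the helper names parse_ctxt_switches and read_self_ctxt_switches are hypothetical and not part of the original program.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: scan the text of a /proc/<pid>/status file line by
 * line and extract both counters.
 * Returns 0 if both fields were found, -1 otherwise. */
int parse_ctxt_switches(const char *text, long *vol, long *nonvol)
{
    int found = 0;
    for (const char *line = text; line != NULL && *line != '\0'; ) {
        /* sscanf matches from the start of the line, so the
         * "nonvoluntary_..." line cannot match the first pattern. */
        if (sscanf(line, "voluntary_ctxt_switches: %ld", vol) == 1)
            found |= 1;
        else if (sscanf(line, "nonvoluntary_ctxt_switches: %ld", nonvol) == 1)
            found |= 2;
        const char *nl = strchr(line, '\n');
        line = nl ? nl + 1 : NULL;   /* advance to the next line */
    }
    return found == 3 ? 0 : -1;
}

/* Hypothetical helper: read the calling process's own status file.
 * Assumes the file fits in the buffer; if it does not, the counters at
 * the end are cut off and -1 is returned. */
int read_self_ctxt_switches(long *vol, long *nonvol)
{
    char buf[8192];
    FILE *f = fopen("/proc/self/status", "r");
    if (!f)
        return -1;
    size_t len = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[len] = '\0';
    return parse_ctxt_switches(buf, vol, nonvol);
}
```

Calling read_self_ctxt_switches once before the workload and once after it would give the same two snapshots as start.txt and finish.txt, without spawning extra processes.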

After the execution I see:

$ tail -n2  start.txt
voluntary_ctxt_switches:        2
nonvoluntary_ctxt_switches:     0
$ tail -n2  finish.txt
voluntary_ctxt_switches:        5
nonvoluntary_ctxt_switches:     573
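
The figure to compare against perf is the difference between the two snapshots rather than the finish values alone. With the numbers above that is (5 - 2) + (573 - 0) = 576. As a small sketch (the struct and function names are hypothetical):

```c
/* Hypothetical container for one snapshot of the two counters. */
struct ctxt_counts {
    long voluntary;
    long nonvoluntary;
};

/* Total context switches that happened between two snapshots. */
long ctxt_delta(struct ctxt_counts start, struct ctxt_counts finish)
{
    return (finish.voluntary - start.voluntary)
         + (finish.nonvoluntary - start.nonvoluntary);
}
```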

So there are fewer than 600 context switches, which is far fewer than perf reports. My questions are:

  1. Does perf itself affect the measurement? If so, its overhead is large.

  2. Does "context switch" mean the same thing in both methods?

  3. Which one is more reliable?
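
One thing that may bear on question 2: as far as I understand, the counters in /proc/pid/status belong to the main thread only, while every thread has its own counters in /proc/pid/task/tid/status. A comparison with perf would then need the sum over all threads. The sketch below is one way to compute that sum; the function names are hypothetical, and it only sees threads that are still alive when it is called.

```c
#include <dirent.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: read both counters from one status file.
 * Returns 0 if both fields were found, -1 otherwise. */
int read_ctxt_switches(const char *path, long *vol, long *nonvol)
{
    char line[256];
    int found = 0;
    FILE *f = fopen(path, "r");
    if (!f)
        return -1;
    while (fgets(line, sizeof(line), f)) {
        if (sscanf(line, "voluntary_ctxt_switches: %ld", vol) == 1)
            found |= 1;
        else if (sscanf(line, "nonvoluntary_ctxt_switches: %ld", nonvol) == 1)
            found |= 2;
    }
    fclose(f);
    return found == 3 ? 0 : -1;
}

/* Hypothetical helper: sum the counters over all live threads of a process.
 * Returns the number of threads counted, or -1 if the task directory
 * cannot be opened. */
int sum_thread_ctxt_switches(pid_t pid, long *vol_total, long *nonvol_total)
{
    char path[512];
    snprintf(path, sizeof(path), "/proc/%d/task", (int)pid);
    DIR *dir = opendir(path);
    if (!dir)
        return -1;

    int threads = 0;
    *vol_total = 0;
    *nonvol_total = 0;

    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        if (entry->d_name[0] == '.')
            continue;                      /* skip "." and ".." */
        long vol, nonvol;
        snprintf(path, sizeof(path), "/proc/%d/task/%s/status",
                 (int)pid, entry->d_name);
        if (read_ctxt_switches(path, &vol, &nonvol) == 0) {
            *vol_total += vol;
            *nonvol_total += nonvol;
            threads++;
        }
    }
    closedir(dir);
    return threads;
}
```

Calling sum_thread_ctxt_switches(getpid(), ...) at the start and at the end of main, while the worker threads still exist, would give per-process totals that are closer in scope to what perf stat counts.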
