Effect of sched_rr_timeslice_ms on program performance
Consider the following code which sets the scheduling policy to SCHED_RR and performs a dummy loop.
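#include <sched.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    pid_t pid = getpid();
    /* 99 is the highest real-time priority on Linux. */
    struct sched_param sp = { .sched_priority = 99 };

    /* Switching to SCHED_RR needs root or CAP_SYS_NICE; report a failure instead of ignoring it. */
    if (sched_setscheduler(pid, SCHED_RR, &sp) == -1)
        perror("sched_setscheduler");

    /* Print the policy that is actually in effect. */
    switch (sched_getscheduler(pid)) {
    case SCHED_OTHER: printf("SCHED_OTHER\n"); break;
    case SCHED_RR:    printf("SCHED_RR\n");    break;
    case SCHED_FIFO:  printf("SCHED_FIFO\n");  break;
    default:          printf("Unknown...\n");
    }

    /* Dummy CPU-bound loop; the sum wraps modulo 2^64, which is well defined for unsigned types. */
    unsigned long long sum = 0;
    for (unsigned long long i = 0; i < 30000000000ULL; i++)
        sum += i;
    printf("%llu\n", sum);
    return 0;
}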
I have tested the code with two kernel.sched_rr_timeslice_ms values, 1 and 1000. The perf results are:
$ sudo sysctl kernel.sched_rr_timeslice_ms=1
kernel.sched_rr_timeslice_ms = 1
$ perf stat -a -e instructions,cycles,context-switches,cpu-migrations -- sudo ./test
SCHED_RR
7278142215970761216
Performance counter stats for 'system wide':
120,100,665,611 instructions # 3.98 insn per cycle
30,160,659,660 cycles
1,717 context-switches
29 cpu-migrations
7.810369637 seconds time elapsed
$ sudo sysctl kernel.sched_rr_timeslice_ms=1000
kernel.sched_rr_timeslice_ms = 1000
$ perf stat -a -e instructions,cycles,context-switches,cpu-migrations -- sudo ./test
SCHED_RR
7278142215970761216
Performance counter stats for 'system wide':
120,094,568,266 instructions # 3.98 insn per cycle
30,151,055,338 cycles
1,724 context-switches
23 cpu-migrations
7.726291605 seconds time elapsed
$ uname -r
5.13.0-27-generic
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 20.04.3 LTS
Release: 20.04
Codename: focal
Despite the large change in the RR time slice, the run time, context switches and other counters are essentially the same.
So what is the effect of modifying that kernel parameter? I expected a larger RR time slice to result in fewer context switches, at least.
Comments (1)
I think it doesn't count as a context switch when the kernel chooses to return to the same user-space task that was interrupted (by a timer interrupt or whatever else fires at the end of its timeslice). That's a surprisingly high number of context switches, I think.
Anyway, realtime scheduling isn't about optimizing for overall throughput, which is what you're measuring (the time to finish this long-running loop).
The important factor here is what happens if you run more threads or processes (tasks) than you have physical cores: with a 1000ms round-robin timeslice, once all your cores are occupied with those tasks, nothing else in user-space gets a timeslice for a whole second, not even your X server or terminal emulator. Only kernel interrupt handlers would still run.
Real-time systems are about latency guarantees, and how long a task can monopolize a core without being switched away from is an important scheduler consideration. (I think most tasks you'd want to run with SCHED_RR would not be purely doing compute work, but likely doing frequent I/O as well, or at least interacting with other processes via shared memory or signals.)