How do the schedule() and switch_to() functions in the Linux kernel actually work?

Published 2024-11-17 21:58:41

I'm trying to understand how the scheduling process in the Linux kernel actually works. My question is not about the scheduling algorithm. It's about how the functions schedule() and switch_to() work.

I'll try to explain. I saw that:

When a process runs out of time-slice, the flag need_resched is set by scheduler_tick(). The kernel checks the flag, sees that it is set, and calls schedule() (pertinent to question 1) to switch to a new process. This flag is a message that schedule should be invoked as soon as possible because another process deserves to run.
Upon returning to user-space or returning from an interrupt, the need_resched flag is checked. If it is set, the kernel invokes the scheduler before continuing.

Looking into the kernel source (linux-2.6.10, the version that the book "Linux Kernel Development, second edition" is based on), I also saw that some code paths call the schedule() function voluntarily, giving another process the right to run.
I saw that the function switch_to() is the one that actually does the context switch. I looked into some architecture dependent codes, trying to understand what switch_to() was actually doing.

That behavior raised some questions that I could not find the answers for:

  1. When switch_to() finishes, what is the current running process? The process that called schedule()? Or the next process, the one that was picked to run?

  2. When schedule() gets called by an interrupt, the selected process to run starts to run when the interrupt handling finishes (after some kind of RTE) ? Or before that?

  3. If the schedule() function cannot be called from an interrupt, when is the need_resched flag set?

  4. When the timer interrupt handler is working, what stack is being used?

I don't know if I have made myself clear; if not, I hope the answers (or follow-up questions) will help.
I have already looked at several sources trying to understand this process, including the book "Linux Kernel Development, second edition".
I know a bit about the MIPS and H8300 architectures, if that helps to explain.

Comments (1)

孤独患者 2024-11-24 21:58:41

  1. After calling switch_to(), the kernel stack is switched to that of the task named in next. Changing the address space, etc., is handled in e.g. context_switch().
  2. schedule() cannot be called in atomic context, including from an interrupt (see the check in schedule_debug()). If a reschedule is needed, the TIF_NEED_RESCHED task flag is set, which is checked in the interrupt return path.
  3. See 2.
  4. I believe that, with the default 8K stacks, interrupts are handled on whatever kernel stack is currently executing. If 4K stacks are used, I believe there's a separate interrupt stack (automatically loaded thanks to some x86 magic), but I'm not completely certain on that point.

To be a bit more detailed, here's a practical example:

  1. An interrupt occurs. The CPU switches to an interrupt trampoline routine, which pushes the interrupt number onto the stack, then jmps to common_interrupt.
  2. common_interrupt calls do_IRQ, which disables preemption then handles the IRQ
  3. At some point, a decision is made to switch tasks. This may be from the timer interrupt, or from a wakeup call. In either case, set_task_need_resched is invoked, setting the TIF_NEED_RESCHED task flag.
  4. Eventually, the CPU returns from do_IRQ in the original interrupt, and proceeds to the IRQ exit path. If this IRQ was invoked from within the kernel, it checks whether TIF_NEED_RESCHED is set, and if so calls preempt_schedule_irq, which briefly enables interrupts while performing a schedule().
  5. If the IRQ was invoked from userspace, we first check whether there's anything that needs doing prior to returning. If so, we go to retint_careful, which checks for a pending reschedule (and directly invokes schedule() if needed) as well as for pending signals, then goes back for another round at retint_check until no more important flags are set.
  6. Finally, we restore GS and return from the interrupt handler.

As for switch_to(): what switch_to() (on x86-32) does is:

  1. Save the current values of EIP (instruction pointer) and ESP (stack pointer) for when we return to this task at some point later.
  2. Switch the value of current_task. At this point, current now points to the new task.
  3. Switch to the new stack, then push the EIP saved by the task we're switching to onto the stack. Later, a return will be performed, using this EIP as the return address; this is how it jumps back to the old code that previously called switch_to().
  4. Call __switch_to(). At this point, current points to the new task, and we're on the new task's stack, but various other CPU state hasn't been updated. __switch_to() handles switching the state of things like the FPU, segment descriptors, debug registers, etc.
  5. Upon return from __switch_to(), the return address that switch_to() manually pushed onto the stack is returned to, placing execution back where it was prior to the switch_to() in the new task. Execution has now fully resumed on the switched-to task.

x86-64 is very similar, but has to do slightly more saving/restoration of state due to the different ABI.
