How expensive is a context switch? Is it better to implement manual task switching instead of relying on OS threads?
Imagine I have two (three, four, whatever) tasks that have to run in parallel. Now, the easy way to do this would be to create separate threads and forget about it. But on a plain old single-core CPU that would mean a lot of context switching - and we all know that context switching is big, bad, slow, and generally simply Evil. It should be avoided, right?
On that note, if I'm writing the software from the ground up anyway, I could go the extra mile and implement my own task switching. Split each task into parts, save the state in between, and then switch among them within a single thread. Or, if I detect that there are multiple CPU cores, I could just give each task to a separate thread and all would be well.
The second solution does have the advantage of adapting to the number of available CPU cores, but will the manual task switch really be faster than the one in the OS kernel? Especially if I'm trying to make the whole thing generic with a TaskManager and an ITask, etc.?
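To make what I mean by "manual task switching" concrete, here is a rough sketch of the kind of thing I have in mind (the TaskManager and ITask names are purely illustrative, not an existing API): each task keeps its own state, does a bounded slice of work per call, and a single thread simply loops over the tasks.

```cpp
// Hypothetical sketch only: ITask and TaskManager are illustrative names,
// not part of any existing Windows or C++ API. Each task keeps its own
// intermediate state and does a bounded amount of work per call; a single
// thread round-robins over the tasks, so a "context switch" is just a
// virtual call and a loop iteration.
#include <memory>
#include <utility>
#include <vector>

struct ITask {
    virtual ~ITask() = default;
    // Do one bounded slice of work. Return false once the task is finished.
    virtual bool RunSlice() = 0;
};

class TaskManager {
public:
    void Add(std::unique_ptr<ITask> task) { tasks_.push_back(std::move(task)); }

    // Run every task to completion on the calling thread, switching between
    // tasks manually after each slice.
    void RunAll() {
        while (!tasks_.empty()) {
            for (auto it = tasks_.begin(); it != tasks_.end();) {
                if ((*it)->RunSlice())
                    ++it;                  // still has work, keep it
                else
                    it = tasks_.erase(it); // finished, drop it
            }
        }
    }

private:
    std::vector<std::unique_ptr<ITask>> tasks_;
};
```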
Clarification: I'm a Windows developer, so I'm primarily interested in the answer for this OS, but it would be interesting to find out about other OSes as well. When you write your answer, please state which OS it applies to.
More clarification: OK, so this isn't in the context of a particular application. It's really a general question, the result of my musings about scalability. If I want my application to scale and effectively utilize future CPUs (and even the different CPUs of today), I must make it multithreaded. But how many threads? If I create a constant number of threads, the program will perform suboptimally on any CPU that doesn't have that number of cores.
Ideally the number of threads would be determined at runtime, but few tasks can truly be split into an arbitrary number of parts at runtime. Many tasks, however, can be split across a fairly large, constant number of threads at design time. So, for instance, if my program could spawn 32 threads, it would already utilize all cores of up to 32-core CPUs, which is still pretty far in the future (I think). But on a simple single-core or dual-core CPU it would mean a LOT of context switching, which would slow things down.
Thus my idea about manual task switching. This way one could create 32 "virtual" threads which would be mapped to as many real threads as is optimal, and the "context switching" would be done manually. The question is just: would the overhead of my manual "context switching" be less than that of OS context switching?
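As a sketch of that mapping (reusing the hypothetical ITask/TaskManager from above; std::thread::hardware_concurrency() is standard C++11 and may return 0 if the count is unknown), the "virtual" threads could be spread over however many real threads the hardware offers, with the switching inside each real thread done manually:

```cpp
// Rough sketch, reusing the hypothetical ITask/TaskManager above: take a
// fixed set of "virtual" tasks and spread them over however many real
// threads the hardware offers.
#include <cstddef>
#include <memory>
#include <thread>
#include <utility>
#include <vector>

void RunOnHardwareThreads(std::vector<std::unique_ptr<ITask>> tasks) {
    std::size_t hw = std::thread::hardware_concurrency();
    if (hw == 0) hw = 1;                                    // fall back if unknown
    std::size_t workers = hw < tasks.size() ? hw : tasks.size();
    if (workers == 0) return;

    std::vector<std::thread> pool;
    for (std::size_t w = 0; w < workers; ++w) {
        // Each worker takes a disjoint, strided share of the tasks and
        // switches among them cooperatively; OS context switches only
        // happen between the (few) real threads.
        pool.emplace_back([&tasks, w, workers] {
            TaskManager local;
            for (std::size_t i = w; i < tasks.size(); i += workers)
                local.Add(std::move(tasks[i]));
            local.RunAll();
        });
    }
    for (auto& t : pool) t.join();
}
```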
Naturally, all this applies to processes which are CPU-bound, like games. For your run-of-the-mill CRUD application this has little value. Such an application is best made with one thread (at most two).
3 Answers
I don't see how a manual task switch could be faster, since the OS kernel is still switching other processes, including switching yours into and out of the running state too. Seems like a premature optimization and a potentially huge waste of effort.
If the system isn't doing anything else, chances are you won't have a huge number of context switches anyway. The thread will use its timeslice, the kernel scheduler will see that nothing else needs to run, and it will switch right back to your thread. Also, the OS will make a best effort to avoid moving threads between CPUs, so you benefit from warm caches there.
If you are really CPU-bound, detect the number of CPUs and start that many threads. You should see nearly 100% CPU utilization. If not, you aren't completely CPU-bound, and maybe the answer is to start N + X threads. For heavily I/O-bound processes, you would start a (large) multiple of the CPU count (e.g. high-traffic web servers run 1000+ threads).
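A minimal sketch of that detect-and-spawn approach in portable C++ (worker is just a placeholder for whatever CPU-bound work you want to split up):

```cpp
// Minimal sketch of "detect the number of CPUs and start that many threads"
// in portable C++11; worker() is a placeholder for whatever CPU-bound work
// each thread should perform.
#include <thread>
#include <vector>

void worker(unsigned index) {
    (void)index;  // ... CPU-bound work for this slot ...
}

int main() {
    unsigned n = std::thread::hardware_concurrency();
    if (n == 0) n = 1;  // the call may return 0 if the count is unknown

    std::vector<std::thread> threads;
    for (unsigned i = 0; i < n; ++i)
        threads.emplace_back(worker, i);
    for (auto& t : threads)
        t.join();
    return 0;
}
```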
Finally, for reference, both Windows and Linux schedulers wake up every millisecond to check if another process needs to run. So, even on an idle system you will see 1000+ context switches per second. On heavily loaded systems, I have seen over 10,000 per second per CPU without any significant issues.
The only advantage of a manual switch that I can see is that you have better control of where and when the switch happens. The ideal place is of course right after a unit of work has been completed, so that you can discard all of its state together. This saves you cache misses.
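For illustration only, assuming the hypothetical ITask sketch from the question, a task that yields exactly at unit-of-work boundaries might look like this:

```cpp
// Illustration only, assuming the hypothetical ITask sketch from the
// question: this task yields exactly at unit-of-work boundaries, so each
// slice finishes one whole item and its working set can be dropped before
// any switch happens.
#include <cstddef>
#include <utility>
#include <vector>

struct WorkItem { /* whatever one unit of work needs */ };

class BatchTask : public ITask {
public:
    explicit BatchTask(std::vector<WorkItem> items) : items_(std::move(items)) {}

    bool RunSlice() override {
        if (next_ >= items_.size())
            return false;            // nothing left, task is done
        Process(items_[next_++]);    // complete one whole unit of work
        return next_ < items_.size();
    }

private:
    void Process(WorkItem&) { /* CPU-bound work on a single item */ }

    std::vector<WorkItem> items_;
    std::size_t next_ = 0;
};
```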
I advise not to spend your effort on this.
Single-core Windows machines are going to become extinct in the next few years, so I generally write new code with the assumption that multi-core is the common case. I'd say go with OS thread management, which will automatically take care of whatever concurrency the hardware provides, now and in the future.
I don't know what your application does, but unless you have multiple compute-bound tasks, I doubt that context switches are a significant bottleneck in most applications. If your tasks block on I/O, then you are not going to get much benefit from trying to out-do the OS.