Surely schedulers can't be harmful? Don't we have better APIs?

Posted on 2024-12-06 20:16:28

I'm wondering what APIs are available to avoid the following problem.

Casting my mind back to Operating System lectures on my old CS course, the topic was multiprocess scheduling and concurrent I/O. Here's what the lecturer gave as an example of what would happen:

Two processes, X and Y, have some work to do. There's one processor/bus/whatever, and the scheduler naively distributes timeslices between X and Y as follows:

  • X gets timeslice 1
  • Y gets timeslice 2
  • X gets timeslice 3
  • ...

This was described as being "fair", but it seems grossly unfair to me. Consider two cases under this scheme:

  • If X and Y are both going to take 10 seconds each, now both will take 20 seconds.

  • If X requires 10 seconds and Y requires 100 seconds, then X will take 20 seconds and Y will take 110 seconds.

If the scheduler simply did "all of X, then all of Y", then in the first case X would take 10 seconds and Y would take 20 seconds; in the second case X would take 10 seconds and Y would take 110 seconds.

How can a system which makes nobody better off and somebody worse off be a good idea? The only argument in the "fair" system's favour is that if we did all of Y before any of X, then a small job X would be delayed by a large job Y, and we need to keep both jobs "responsive".

For the second case, part of me sees the natural "best" way as being to say "X is 10 times smaller, therefore absent any explicit preference, it should get 10 times as many timeslices as Y". (It's a bit like giving pedestrians right of way before cars on the grounds that they put less strain on the roads, but I digress.) Under this scheme, X finishes in 11 seconds and Y finishes in 110 seconds. Real world consequence: my mp3 loads and plays without appreciable extra delay even though a massive file copy is happening in the background.
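The figures quoted above can be checked with a small simulation. This is only a sketch under simplifying assumptions: timeslices are treated as infinitely small, context switches cost nothing, and the function names `weighted_sharing` and `run_to_completion` are made up for illustration rather than taken from any OS API.

```python
from fractions import Fraction


def weighted_sharing(work, weights):
    """Finish times when one CPU is shared in proportion to integer weights.

    work:    dict job name -> seconds of CPU the job needs
    weights: dict job name -> relative share of timeslices (integers)
    """
    remaining = {name: Fraction(need) for name, need in work.items()}
    finish = {}
    now = Fraction(0)
    while remaining:
        total = sum(weights[name] for name in remaining)
        # Fraction of the CPU each unfinished job receives right now.
        rate = {name: Fraction(weights[name], total) for name in remaining}
        # Advance time until the next job completes.
        step = min(remaining[name] / rate[name] for name in remaining)
        now += step
        for name in list(remaining):
            remaining[name] -= step * rate[name]
            if remaining[name] == 0:
                finish[name] = now
                del remaining[name]
    return finish


def run_to_completion(work, order):
    """Finish times when jobs run one after another in the given order."""
    finish = {}
    now = Fraction(0)
    for name in order:
        now += work[name]
        finish[name] = now
    return finish


if __name__ == "__main__":
    work = {"X": 10, "Y": 100}
    print("equal slices:  ", weighted_sharing(work, {"X": 1, "Y": 1}))
    print("X then Y:      ", run_to_completion(work, ["X", "Y"]))
    print("10:1 weighting:", weighted_sharing(work, {"X": 10, "Y": 1}))
```

With X = 10 s and Y = 100 s, equal slices give 20 s and 110 s, running X first gives 10 s and 110 s, and the 10:1 weighting gives 11 s and 110 s, matching the numbers in the question. All three policies needed the job sizes or weights as input, which is the point being made.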

Obviously there is a whole universe of strategies available, and I don't want to argue the suitability of any particular one. My point is this: all such strategies require knowledge of the size of the job.

So, are there OS APIs (Linux, or even Windows) which allow one to specify hints of the amount of work an operation will take?

(NB: you could claim disk I/O incorporates this implicitly, but while(not_done){read_chunk();} would render it meaningless. The kind of API I'm thinking of would specify megabytes at file open time, clock cycles at thread creation time, or something along these lines.)

Comments (3)

倥絔 2024-12-13 20:16:28

If all tasks represent work that has no value until it is run to completion, then the best approach is to run all the jobs in some sequence that minimizes the cost of other things (or people) having to wait for them. In practice, many tasks represent a sequence of operations which may each have some individual value, so if two tasks will take ten seconds each, having both tasks half done at the ten-second mark may be better than having one task completed and the other not even started. This is especially true if the tasks are producing data needed by a downstream process running on another machine, since the downstream process can do useful work any time it has received more data than it has processed. It is also somewhat true if part of the work entails showing a person that something useful is actually happening. A user who watches a progress bar count up over a period of 20 seconds is less likely to get unhappy than one whose progress bar doesn't even budge for ten seconds.

感性不性感 2024-12-13 20:16:28

In common operating systems you typically don't care about the delay of an individual task; you try to maximize throughput. In 110 seconds both X and Y will be done, period. Of course, some of the processes can be interactive, so the OS accepts the extra overhead of context switches between processes to keep up the illusion of parallel computation.

As you said, any strategy that aims to minimize a task's completion time needs to know how long the task will take. That is very often hard to determine if the task is anything more than copying a file. It's why the progress bar in some applications sometimes goes to 99% and stays there for a while, doing just the last few things.

However, in real-time operating systems you often have to know a task's worst-case execution time, or some deadline by which the task must be finished, and then you are obliged to provide such a "hint". The scheduler must then do somewhat smarter scheduling (all the more so if locks or dependencies are involved). On multiprocessors the scheduling problem is sometimes NP-complete, in which case the scheduler uses heuristics.

I suggest you read something about RTOSes, Earliest Deadline First scheduling and Rate Monotonic scheduling.
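To make the Earliest Deadline First idea concrete, here is a toy single-CPU simulation. It is a sketch only, not how any particular RTOS implements it: time advances in unit steps, each job supplies its release time, worst-case execution time and absolute deadline as the "hint", and the job names and numbers are invented for the example.

```python
def edf_schedule(jobs, horizon):
    """Preemptive Earliest Deadline First on one CPU, in unit time steps.

    jobs:    dict name -> (release, wcet, deadline), all integers
    horizon: number of time steps to simulate
    Returns (timeline, finish), where timeline[t] is the job that ran
    during [t, t + 1) or None if the CPU was idle, and finish maps each
    completed job to its completion time.
    """
    remaining = {name: wcet for name, (_, wcet, _) in jobs.items()}
    timeline = []
    finish = {}
    for t in range(horizon):
        # Jobs that have been released and still have work left.
        ready = [n for n in jobs if jobs[n][0] <= t and remaining[n] > 0]
        if not ready:
            timeline.append(None)
            continue
        # Always run the ready job whose deadline is closest.
        current = min(ready, key=lambda n: jobs[n][2])
        remaining[current] -= 1
        timeline.append(current)
        if remaining[current] == 0:
            finish[current] = t + 1
    return timeline, finish


if __name__ == "__main__":
    # Hypothetical workload: a long copy and a short, urgent mp3 decode.
    jobs = {"copy": (0, 5, 12), "mp3": (2, 2, 5)}
    timeline, finish = edf_schedule(jobs, horizon=8)
    print(timeline)
    print(finish)
```

In this example the copy runs for two steps, is preempted when the mp3 job arrives with its earlier deadline, and resumes afterwards; both jobs finish before their deadlines (mp3 at 4, copy at 7).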

内心旳酸楚 2024-12-13 20:16:28

> The only argument in the "fair" system's favour is that if we did all of Y before any of X then a small job X would be delayed by a large job Y and we need to keep both jobs "responsive".

That's exactly the rationale. Fair scheduling is fair in that it tends to distribute computing time, and therefore delays, equally among processes asking for it.

> So, are there OS APIs (Linux, or even Windows) which allow one to specify hints of the amount of work an operation will take?

Batch systems do this, but, as you concluded yourself, this requires knowledge of the task at hand. Unix/Linux has the nice command which gives a process lower priority; it's a good idea to let any long running, CPU-bound process on a multitasking machine be "nice" so it doesn't hold up short and interactive tasks. ionice does the same for IO priority.
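As a rough illustration of the nice approach from inside a program, the sketch below starts a child process with its niceness raised. It assumes a Unix-like system; `run_niced` and `child_niceness` are names invented for this example, and the child here does nothing except report its own niceness.

```python
import os
import subprocess
import sys


def run_niced(cmd, increment=10):
    """Run cmd in a child process whose niceness is raised by `increment`.

    Unix only: os.nice() is called in the child just before exec, so the
    parent's own priority is left unchanged.
    """
    return subprocess.run(
        cmd,
        preexec_fn=lambda: os.nice(increment),
        capture_output=True,
        text=True,
        check=True,
    )


def child_niceness(increment=10):
    """Start a niced child Python process and return the niceness it sees."""
    result = run_niced(
        [sys.executable, "-c", "import os; print(os.nice(0))"],
        increment,
    )
    return int(result.stdout)


if __name__ == "__main__":
    print("parent niceness:", os.nice(0))
    print("child niceness: ", child_niceness(10))
```

From a shell, `nice -n 10 some_command` has the same effect. Note that this is a priority hint, not a statement of how much work the job will do, so it only partly answers the question as asked.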

(Also, ever since the early 1970s, Unix schedulers have dynamically raised the priority of processes that do not "eat up" their slices, so interactive processes get high CPU priority and stay responsive without CPU-bound ones holding everything up. See Thompson and Ritchie's early papers on Unix.)
