“偷工减料”与“工作耸肩”？

发布于 2024-08-27 09:33:24 字数 723 浏览 15 评论 0原文

为什么我可以找到很多关于“工作窃取”的信息，而没有找到关于“工作耸肩”作为动态负载平衡策略的信息？

我所说的“工作耸肩”是指将多余的工作从繁忙的处理器上推到负载较少的邻居上，而不是让空闲的处理器从繁忙的邻居那里拉取工作（“工作窃取”））。

我认为两种策略的总体可扩展性应该是相同的。然而，我相信就延迟和延迟而言，它的效率要高得多。当肯定有工作要做时唤醒空闲处理器，而不是让所有空闲处理器定期轮询所有邻居以查找可能的工作。

无论如何，快速谷歌没有在“工作耸肩”或类似的标题下显示任何内容，因此任何指向现有技术和该策略的行话的指示都将受到欢迎。

澄清

我实际上设想工作提交处理器（可能是也可能不是目标处理器）负责查看首选目标处理器的直接位置（基于数据） /code locality）来决定是否应该给近邻分配新工作，因为他们没有那么多工作要做。

我不认为决策逻辑需要的只是原子读取直接（通常是 2 到 4 个）邻居的估计 q 长度。我不认为这比盗贼轮询和盗贼暗示的耦合程度更高。从邻居那里偷窃。（我假设两种策略中都有“无锁、无等待”队列）。

解决方案

看来我的意思（但仅部分描述！）作为“工作耸肩”策略属于“正常”预先调度策略的领域，这些策略恰好对处理器、缓存和调度策略很聪明。内存忠诚度和可扩展性。

我发现很多参考文献都在搜索这些术语，其中一些看起来非常可靠。当我找到最符合（或推翻！）我对“工作耸肩”的定义的逻辑时，我将发布一个参考。

原文

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

傲娇萝莉攻 2024-09-03 09:33:24

负载均衡不是免费的；它具有上下文切换（到内核）、查找空闲处理器以及选择要重新分配的工作的成本。尤其是在任务始终切换（每秒数十次）的机器中，这种成本会增加。

那么有什么区别呢？工作耸肩意味着您会因负载平衡的开销而进一步加重过度配置的资源（繁忙的处理器）的负担。当隔壁的处理器无事可做时，为什么要通过 administrivia 来中断繁忙的处理器呢？另一方面，工作窃取让空闲处理器运行负载平衡器，而繁忙的处理器则继续工作。偷工减料可以节省时间。

示例

考虑：处理器 A 分配有两个任务。它们分别需要时间 a1 和 a2。附近的处理器 B（也许是缓存反弹的距离）处于空闲状态。处理器在所有方面都是相同的。我们假设每个任务和内核的代码都位于两个处理器的 i-cache 中（负载平衡时没有添加页面错误）。

任何类型的上下文切换（包括负载平衡）都需要时间 c。

无负载平衡

完成任务的时间将为 a1 + a2 + c。 处理器 A 将完成所有工作，并在两项任务之间进行一次上下文切换。

工作窃取

假设 B 窃取了 a2，从而导致上下文切换时间本身。这项工作将在 max(a1, a2 + c) 时间内完成。假设处理器 A 开始处理 a1；当它这样做时，处理器 B 将窃取 a2 并避免处理 a1 时出现任何中断。 B 上的所有开销都是空闲周期。

如果a2是较短的任务，那么在这里，您已经有效地隐藏了这种情况下上下文切换的成本；总时间为a1。

工作耸肩

假设B完成a2，如上所述，但A产生了移动它的成本（“耸肩”工作）。本例中的工作将在 max(a1, a2) + c 时间内完成；上下文切换现在总是附加到总时间中，而不是被隐藏。处理器 B 的空闲周期在这里被浪费了；相反，繁忙的处理器 A 将时间交给 B 处理。

Load balancing is not free; it has a cost of a context switch (to the kernel), finding the idle processors, and choosing work to reassign. Especially in a machine where tasks switch all the time, dozens of times per second, this cost adds up.

So what's the difference? Work-shrugging means you further burden over-provisioned resources (busy processors) with the overhead of load-balancing. Why interrupt a busy processor with administrivia when there's a processor next door with nothing to do? Work stealing, on the other hand, lets the idle processors run the load balancer while busy processors get on with their work. Work-stealing saves time.

Example

Consider: Processor A has two tasks assigned to it. They take time a1 and a2, respectively. Processor B, nearby (the distance of a cache bounce, perhaps), is idle. The processors are identical in all respects. We assume the code for each task and the kernel is in the i-cache of both processors (no added page fault on load balancing).

A context switch of any kind (including load-balancing) takes time c.

No Load Balancing

The time to complete the tasks will be a1 + a2 + c. Processor A will do all the work, and incur one context switch between the two tasks.

Work-Stealing

Assume B steals a2, incurring the context switch time itself. The work will be done in max(a1, a2 + c) time. Suppose processor A begins working on a1; while it does that, processor B will steal a2 and avoid any interruption in the processing of a1. All the overhead on B is free cycles.

If a2 was the shorter task, here, you have effectively hidden the cost of a context switch in this scenario; the total time is a1.

Work-Shrugging

Assume B completes a2, as above, but A incurs the cost of moving it ("shrugging" the work). The work in this case will be done in max(a1, a2) + c time; the context switch is now always in addition to the total time, instead of being hidden. Processor B's idle cycles have been wasted, here; instead, a busy processor A has burned time shrugging work to B.

回复收藏 0 原文