线程间负载平衡的启发式算法
我正在开发一个多线程程序,其中有许多工作线程执行长度不等的任务。我想对任务进行负载平衡,以确保它们完成大致相同的工作量。对于每个任务 Ti,我都有一个数字 ci,它可以很好地近似该任务所需的工作量。
我正在寻找一种高效的(O(N) N = 任务数或更好)算法,该算法将在给定 ci 值的情况下为我“大致”提供良好的负载平衡。它不一定是最优的,但我希望能够对结果分配的糟糕程度有一些理论界限。
有什么想法吗?
I'm working on a multi-threaded program where I have a number of worker threads performing tasks of unequal length. I want to load-balance the tasks to ensure that they do roughly the same amount of work. For each task Ti I have a number ci which provides a good approximation to the amount of work that is required for that task.
I'm looking for an efficient (O(N) N = number of tasks or better) algorithm which will give me "roughly" a good load balance given the values of ci. It doesn't have to be optimal, but I would like to be able to have some theoretical bounds on how bad the resulting allocations are.
Any ideas?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(6)
虽然有关背包问题的建议很有帮助,但您说您正在努力最小化净执行时间。采用背包方法需要您不断增加背包的大小,直到获得可行的解决方案 - 这不是很有效。
如果净执行时间受到所有并行工作线程中最长完成时间的限制,我想分配任务,以便最小化所有线程的最大工作时间。这样做可能会导致一个或多个线程不做很多工作,因此我们并没有真正“平衡”工作。如果你想平衡工作,那就是不同的目标函数。例如,您可能希望最小化线程之间工作的差异。
查看作业车间调度领域。如果您不经常这样做,我建议使用遗传算法 - 如果您必须经常以更自动化的方式这样做,我建议对确定性算法进行一些文献搜索。希望这有帮助。
While the suggestion regarding the knapsack problem is helpful, you said that you are trying to minimize the net time of execution. Taking the knapsack approach would require you to keep increasing your knapsack's sizes until you get a feasible solution - not very efficient.
If the net time of execution is limited by the longest completion time among all threads working in parallel, I want to assign tasks so I MINIMIZE the MAXIMUM work time across all threads. Doing this may result in one or more threads that don't do a lot of work, so we're not really "balancing" the work. If you want to balance the work, then that's a different objective function. For instance, you might want to minimize the variance in work among threads.
Look in the area of job shop scheduling. If you only do this infrequently, I'd suggest using a genetic algorithm - if you have to do it frequently and in a more automated manner, I'd suggest doing a little literature searches for deterministic algorithms. Hope this helps.
您想要实现工作窃取算法。每个工作线程都有一个双端队列,新任务会添加到最小队列的底部。工作人员从自己的队列顶部删除任务(顶部/底部分离减少了争用),当工作人员没有更多工作要做时,它会从最大队列的底部窃取工作。它很简单,而且运行良好,我相信这是微软.net4.0附带的并行系统所基于的算法。
结果分配非常好,只有当整个系统中没有更多可用作业时,工作线程才会无事可做。
铌。如果您想要分解一些示例代码,我的朋友为 C# 编写了一个工作窃取系统,您可以在此处
You want to implement a work stealing algorithm. Each worker thread has a double ended queue, new tasks are added to the bottom of the smallest queue. Workers remove tasks from the top of their own queue (the top/bottom separation reduces contention), when a worker has no more jobs to do, it steals a job off the bottom of the largest queue. It's simple, and works well, this is the algorithm which the Microsoft parallel system coming with .net4.0 is based on I believe.
The resulting allocation is pretty good, worker threads will only be left with no work to do if there are no more jobs available in the entire system.
Nb. If you want some example code to tear apart, my friend wrote a work stealing system for C#, which you can find here
我的倾向不是试图提前弄清楚如何分配任务,而是将它们全部放入一个公共工作队列中。任何无事可做的工作线程都会从队列中获取下一个任务来完成工作并检查队列中的下一个任务。
My inclination is not to try to figure out ahead of time how to assign the tasks, but to throw them all into a common work queue. Any worker thread with nothing else to do grabs the next task from the queue does the work and checks the queue for the next task.
最简单的方法是按 p_i 降序对作业进行排序(但这是 O(n log n))并执行以下操作:
该算法应该为您提供最佳结果,但时间为 O(NM),其中 N 是任务数,M 是线程数。解决方案的总成本为 O(N log N + NM),因此对于 M << N 是 O(N log N),对于 N 附近的 M 来说是 O(n^2)。
Simplest way is to sort jobs descending by p_i (but this is O(n log n)) and do this:
This algorithm should give you best results but with O(NM) time where N is number of tasks and M number of threads. Total cost of solution is O(N log N + NM), so for M << N is O(N log N) and for M near N is O(n^2).
我会看看负载平衡的算法,例如
http://www. ieee802.org/3/trunk_study/february98/congdon.pdf
http://publib.boulder.ibm.com/infocenter/cicstg/v6r0m0/index.jsp?topic=/com.ibm.cicstg600.doc/ccllal0292 .htm
I'd have a look at algorithms for load balancing e.g.
http://www.ieee802.org/3/trunk_study/february98/congdon.pdf
http://publib.boulder.ibm.com/infocenter/cicstg/v6r0m0/index.jsp?topic=/com.ibm.cicstg600.doc/ccllal0292.htm
在 O(N) 中这似乎很容易。
给每个线程一些“点”。让
p_i
点分配给线程T_i
。对于每个任务,选择具有最高p_i
的线程,并从p_i
中减去任务成本。然后,您只需要跟踪按分数排序的线程,这在 O(N) 时间内是微不足道的,并且可以使用平衡树在 O(log N) 中轻松完成。对于连续操作,
p_i
没有最小值。如果您想避免分数走向 -inf,只需定期向所有分数添加任意数量的P
(所有分数的数量相同)。编辑:我得到了错误的N。上面,N是线程数,与提出的问题相反。 N = 任务数,T = 线程数,这会导致 O(N*log T) 成本。如果 T 很“小”,则接近 O(N)。
编辑2:如果提前知道所有任务以及线程数量,那么我认为计算最佳调度类似于背包问题 一般来说,它是 NP 完全问题(所以你会在某个地方得到指数)。正如我上面以某种方式描述的那样,只要所有单独的任务相对于分配给每个线程的总成本而言都有较小的成本,那么基于成本的简单分析将为您提供相对较好的近似值。
In O(N) this seems easy.
Give each thread some "points". Let
p_i
the points allocated to threadT_i
. For each task, choose the thread with the highestp_i
, and subtract the task cost fromp_i
. You then just need to keep track of the threads ordered by score, which is trivial in O(N) time, and can easily be done in O(log N) with a balanced tree.For continuous operation, there is no minimum in
p_i
. If you want to avoid scores going rogue towards -inf, just regularly add an arbitrary amountP
to all scores (the same amount for all scores).Edit: I got the wrong N. Above, N is the number of threads, contrary to the question asked. With N = number of tasks, and T = number of threads, this leads to O(N*log T) cost. If T is "small", this is close to O(N).
Edit 2: If all tasks are known in advance, as well as the number of threads, then I think that computing the optimal scheduling is akin to the knapsack problem and it is, in all generality, NP-complete (so you will get exponentials somewhere). A simple cost-based analysis as I somehow describe above will give you a relatively good approximation as long as all individual tasks have a small cost with regards to the total cost to be allocated to each thread.