分配算法

发布于 2024-08-01 22:42:21 字数 356 浏览 5 评论 0原文

我需要将 N 个实体(每个实体都有可能的父级和可能的子级)分配给 M 个计算节点,同时满足以下优化条件:

  1. 实体的子级希望分配给同一计算节点(以最大化兄弟节点之间的数据局部性)
  2. 的分布实体应该尽可能均匀(即单个节点不会负担过重)。

我正在寻找一些关于启发式方法的建议来解决这个问题。

我已阅读http://en.wikipedia.org/wiki/Assignment%5Fproblem

谢谢。

I need to assign N entities (each with possible parents and possible children) to M computation nodes while satisfying the following optimization conditions:

  1. Children of an entity want to be assigned to the same computation node (to maximize data locality among siblings)
  2. The distribution of entities should be as even as possible (i.e. no overtaxing of a single node).

I'm looking for some suggestions on heuristic methods to solve this problem.

I've read http://en.wikipedia.org/wiki/Assignment%5Fproblem.

Thanks.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

慵挽 2024-08-08 22:42:21

我不确定 1 是否是硬性要求。 如果是这样,作为第一步,您应该将实体分组为连接的组件。 如果不是,您应该指定 1 和 2 之间的权衡是什么,例如作为成本函数。

如果将每个节点限制为 N/M,则将组件放置在计算节点上就是装箱问题实体。 一个很好的近似是以下算法:

  1. 按实体数量对组件进行排序,按降序
  2. 将它们放置到节点上,只要每个节点
  3. 在完成 2 后仍有可用容量,您可能还有尚未放置的组件。 将它们放置在迄今为止负载最小的节点上。

I'm not sure whether 1 is a hard requirement. If so, as a first step, you should group your entities into connected components. If not, you should specify what the tradeoff between 1 and 2 is, e.g. as a cost function.

Placing the components on computation nodes is the binpacking problem, if you limit each node to N/M entities. A good approximation is the following algorithm:

  1. sort the components by number of entities, in decreasing order
  2. place them onto nodes as long as each node has still capacity available
  3. when done with 2, you may have components that have not been placed. Place those on the nodes which have the smallest load so far.
会傲 2024-08-08 22:42:21

好吧,显然您必须预测每个进程平均有多少个子进程(或简单的总负载)。 与您可以使用经典的分配算法相比,它们通常非常简单。

最重要的问题当然是决定要最小化什么。 通常我们希望最大程度地减少延迟(我们得到多少“落后于计划”),但并非总是如此......

编辑:如果您提前知道所有子级/父级,并且进程的子级必须位于同一台机器中,您可以考虑一个进程及其所有子进程一开始都是同一个进程。 然后你可以使用一个非常简单的算法来最小化你想要最小化的任何东西。

编辑#2:阅读作业调度以获得更好的想法

Well, obviously you have to predict how many children (or simply total load) each process will have on average. Than you can use classic assignment algorithms, they're usually quite simple.

The most important issue is of course to decide what you want to minimize. Usually we want to minimize lateness (how much "behind schedule" we get), but not always...

Edit: If you know all children/parents in advance, and children of a process must be in the same machine, you can consider a process and all its children to be the same process to begin with. Then you can use a very simple algorithm to minimize whatever you want to minimize.

Edit #2: Read about Job Scheduling to get a better idea

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文