Greedy multiple knapsack (minimize/reduce the number of bins)

Posted 2024-11-07 20:19:19

Actually, I already have a partial answer for this question, but I'm wondering whether this small piece of greedy code can be generalized into something closer to the optimal solution.

How I met this problem (not relevant to the problem itself, but maybe interesting):

I receive a large collection of objects (a set of dyke profiles; each dyke keeps more or less the same shape along its length) that I can group according to a property (the name of the dyke). The output of my program goes to an external program that we have to invoke by hand (don't ask me why) and which can't recover from failures (one mistake stops the whole batch).

In the application where I'm using this, there's no hard requirement on the number of bins or on their maximum size; what I try to do is to

  • keep the number of groups low (invoke the program only a few times),
  • keep the sets small (reduce the damage if a batch fails),
  • keep similar things together (a failure in a group is probably a failure for the whole group).

I did not have much time, so I wrote a small greedy function that groups sets together.

A colleague suggested I could add some noise to the data to explore the neighbourhood of the approximate solution it finds, and we were wondering how far from optimal the solutions are (see the sketch after the code below).

Not that it is relevant to the original task, which doesn't need a truly optimal solution, but I thought I would share the question with the community and see what comments come out of it.

def group_to_similar_sizes(orig, max_size=None, max_factor=None):
    """group orig list in sections that to not overflow max(orig) (or given max_size).

    return list of grouped indices, plus max effective length.

    >>> group_to_similar_sizes([1, 3, 7, 13])
    ([[2, 1, 0], [3]], 13)
    >>> group_to_similar_sizes([2, 9, 9, 11, 12, 19, 19, 22, 22, ])
    ([[3, 1], [4, 2], [5], [6, 0], [7], [8]], 22)

    result is independent of original ordering
    >>> group_to_similar_sizes([9, 19, 22, 12, 19, 9, 2, 22, 11, ])
    ([[3, 1], [4, 2], [5], [6, 0], [7], [8]], 22)

    you can specify a desired max size
    >>> group_to_similar_sizes([2, 9, 9, 11, 12, 19, 19, 22, 22, ], 50)
    ([[3, 2, 1], [6, 5, 4], [8, 7, 0]], 50)

    if the desired max size is too small, it still influences the way we make groups.
    >>> group_to_similar_sizes([1, 3, 7, 13], 8)
    ([[1], [2, 0], [3]], 13)
    >>> group_to_similar_sizes([2, 9, 9, 11, 12, 19, 19, 22, 22, ], 20)
    ([[1], [3, 2], [4, 0], [5], [6], [7], [8]], 22)

    max size can be adjusted by a multiplication factor
    >>> group_to_similar_sizes([9, 19, 22, 12, 5, 9, 2, 22, 11, ], max_factor=0.75)
    ([[2], [3], [4, 1], [5, 0], [6], [7], [8]], 22)
    >>> group_to_similar_sizes([9, 19, 22, 12, 5, 9, 2, 22, 11, ], max_factor=1.5)
    ([[2, 1], [6, 5], [7, 3, 0], [8, 4]], 33)
    """

    ordered = sorted(orig)
    max_size = max_size or ordered[-1]  # default: the size of the largest item
    if max_factor is not None:
        max_size = int(max_size * max_factor)

    orig_ordered = list(ordered)  # sizes by sorted index (ordered is consumed below)
    todo = set(range(len(orig)))  # sorted indices not yet assigned to a group
    effective_max = 0

    result = []
    ## while we still have unassigned items
    while ordered:
        ## choose the largest remaining item, start a new group with it,
        ## then greedily add the largest items that still fit in its bin

        candidate_i = len(ordered) - 1
        candidate = ordered.pop()
        if candidate_i not in todo:
            continue  # already placed in an earlier group by the inner loop
        todo.remove(candidate_i)

        group = [candidate_i]
        group_size = candidate

        for j in sorted(todo, reverse=True):  # remaining indices, largest sizes first
            if orig_ordered[j] + group_size <= max_size:
                group.append(j)
                group_size += orig_ordered[j]
                todo.remove(j)

        result.insert(0, group)
        effective_max = max(group_size, effective_max)

    return result, effective_max
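
The colleague's noise idea, and the question of how far the greedy result is from optimal, can be explored with a randomised-restart wrapper. Below is a minimal sketch, assuming the function above is in scope; perturbed_search and all of its parameters are hypothetical names, not part of the original code. It rescales the sizes by a small random factor, reruns the greedy, maps the returned indices (which refer to the sorted order) back to positions in the original list, and keeps the grouping with the fewest bins, breaking ties by the smallest true group size.

import random

def perturbed_search(values, max_size, rounds=200, noise=0.1, seed=0):
    # hypothetical helper (not in the original post): rerun the greedy on
    # noisily rescaled sizes and keep the best grouping, scored with the
    # true sizes -- fewer groups first, then smaller effective max
    rng = random.Random(seed)
    best_score, best_groups = None, None
    for _ in range(rounds):
        noisy = [v * (1.0 + rng.uniform(-noise, noise)) for v in values]
        groups, _ = group_to_similar_sizes(noisy, max_size)
        # the returned indices refer to sorted(noisy); map them back to
        # positions in the original values list
        order = sorted(range(len(noisy)), key=noisy.__getitem__)
        true_groups = [[order[j] for j in g] for g in groups]
        # a group that fit under max_size with noisy sizes may slightly
        # overflow it once rescored with the true sizes
        true_max = max(sum(values[i] for i in g) for g in true_groups)
        score = (len(true_groups), true_max)
        if best_score is None or score < best_score:
            best_score, best_groups = score, true_groups
    return best_groups, best_score

As for distance from optimal: with a fixed max_size, no packing can use fewer than ceil(sum(values) / max_size) bins, so comparing the number of groups found against that lower bound gives a quick, if crude, measure of the remaining slack.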

Comments (1)

梦毁影碎の 2024-11-14 20:19:20

I like your colleague's idea of adding some noise to the data, but maybe it's better to make a few swaps in ordered after you call ordered = sorted(orig)?
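
As a hedged sketch of that suggestion (swap_perturb is a hypothetical name, not code from the post): the helper below would be spliced into group_to_similar_sizes right after ordered = sorted(orig), before orig_ordered is built from it. Swapping only adjacent pairs keeps the list nearly sorted, so the largest-first heuristic stays roughly intact while repeated runs explore nearby packings.

import random

def swap_perturb(ordered, swaps=3, rng=random):
    # hypothetical helper sketching the commenter's idea: randomly swap a
    # few adjacent pairs of the sorted sizes so that repeated runs of the
    # greedy visit slightly different packing orders
    perturbed = list(ordered)
    if len(perturbed) < 2:
        return perturbed
    for _ in range(swaps):
        i = rng.randrange(len(perturbed) - 1)
        perturbed[i], perturbed[i + 1] = perturbed[i + 1], perturbed[i]
    return perturbed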
