按乘积顺序获取列表的每个可能子集的算法,无需构建和排序整个列表(即生成器)

发布于 2024-10-19 03:32:40 字数 697 浏览 7 评论 0原文

实际上,我有一组具有概率的对象,并且我想查看它们中的每个可能的组,按照假设它们是独立的情况下它们全部为真的可能性的顺序-- 即按子集元素的乘积的降序排列 -- 或者如果概率相同则按长度顺序排列(以便 (1, 0.5) 在 (0.5) 之后)。

示例:如果我有 [ 1, 0.5, 0.1 ] 我想要 [ (), (1), (0.5), (1, 0.5), (0.1), (1, 0.1 ), (0.5, 0.1), (1, 0.5, 0.1) ]

本质上,这意味着我想按顺序迭代一组元素的幂集,并且我可以相当容易地生成它,对其进行排序,然后完成。然而,幂集变得相当大,速度非常快,我希望我通常会想要第一个子集之一,而且我宁愿不生成数千个子集的列表,对它们进行排序,然后永远不要看第三个子集。这就是 python 生成器有望拯救世界的地方!

问题的更正式规范,我需要找出一种方法来排序(powerset(input),key = lambda l:reduce(lambda(p,n),e:(p * e,n-1) )、l, (1, 0))、reverse=True),作为生成器,或以其他方式避免构建和排序整个列表。

我相当确定这与背包问题以及子集乘积问题有关,但我真的很难找到一个有效的好算法,并且非常感谢您的帮助:-)。在最坏的情况下(一直迭代到最后),它比构建+排序整个事情慢并不是问题,它只需要更好的最佳情况(例如,在前 10% 内)性能。

Practically, I've got a set of objects with probabilities, and I want to look at each possible group of them, in order of how likely it is that they're all true assuming they're independent -- i.e. in descending order of the product of the elements of the subsets -- or in order of length if the probabilities are the same (so that (1, 0.5) comes after (0.5)).

Example: If I have [ 1, 0.5, 0.1 ] I want [ (), (1), (0.5), (1, 0.5), (0.1), (1, 0.1), (0.5, 0.1), (1, 0.5, 0.1) ]

In essence, this means I want to iterate over the powerset of a set of elements in order, and I could fairly easily generate this, sort it, and be done. However, powersets get pretty big pretty fast, I expect I'm usually going to want one of the first subsets, and I'd rather not generate a list of thousands of subsets, sort them, and then never look past the third. This is where python generators hopefully save the day!

More formal specification of the problem, I need to work out a way to do sorted(powerset(input), key = lambda l : reduce (lambda (p, n), e: (p * e, n-1), l, (1, 0)), reverse=True), as a generator, or in some other way that lets me avoid building and sorting the entire list.

I'm reasonably sure this is related to the knapsack problem, along with the subset product problem, but I'm really struggling to get a nice algorithm for it that works, and help would be very much appreciated :-). It's not an issue for it to be slower than building + sorting the whole thing in the worst case (iterating all the way to the end), it just needs much better best case (within the first 10%, say) performance.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

深巷少女 2024-10-26 03:32:40

好问题,解决起来相当棘手。我也想不出一种按顺序生成组合的方法,但我使用强大的 heapq (又名优先级队列)来保持候选排序。

from heapq import heappush, heappop
import operator

def prob(ps):
    """ returns the probability that *not* all ps are True """
    return 1-reduce(operator.mul, ps)

def gen(ps):
    # turn each to a tuple
    items = ((x,) for x in sorted(ps, reverse=True))

    # create a priority queue, sorted by probability
    pq = [(prob(x),x) for x in items]

    # because you wanted this
    yield ()

    # as long as there are valid combinations
    while pq:
        # get the best un-yielded combination, the pq makes sure of that
        p, x = heappop(pq)
        yield x

        # generate all the combinations from this item
        for other in ps:

            # keeping the tuples sorted -> unique combinations
            if other < x[-1]:

                # create a new combination
                new = x+(other,)
                item = prob(new), new

                # add it to the queue
                heappush(pq,item)


a = [1, 0.1, 0.5] 
print list(gen(a))

Nice question, it was quite tricky to solve. I can't think of a way to generate the combinations in order either, but I wield the mighty heapq (aka a priority queue) to keep the candidates sorted.

from heapq import heappush, heappop
import operator

def prob(ps):
    """ returns the probability that *not* all ps are True """
    return 1-reduce(operator.mul, ps)

def gen(ps):
    # turn each to a tuple
    items = ((x,) for x in sorted(ps, reverse=True))

    # create a priority queue, sorted by probability
    pq = [(prob(x),x) for x in items]

    # because you wanted this
    yield ()

    # as long as there are valid combinations
    while pq:
        # get the best un-yielded combination, the pq makes sure of that
        p, x = heappop(pq)
        yield x

        # generate all the combinations from this item
        for other in ps:

            # keeping the tuples sorted -> unique combinations
            if other < x[-1]:

                # create a new combination
                new = x+(other,)
                item = prob(new), new

                # add it to the queue
                heappush(pq,item)


a = [1, 0.1, 0.5] 
print list(gen(a))
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文