按乘积顺序获取列表的每个可能子集的算法,无需构建和排序整个列表(即生成器)
实际上,我有一组具有概率的对象,并且我想查看它们中的每个可能的组,按照假设它们是独立的情况下它们全部为真的可能性的顺序-- 即按子集元素的乘积的降序排列 -- 或者如果概率相同则按长度顺序排列(以便 (1, 0.5) 在 (0.5) 之后)。
示例:如果我有 [ 1, 0.5, 0.1 ]
我想要 [ (), (1), (0.5), (1, 0.5), (0.1), (1, 0.1 ), (0.5, 0.1), (1, 0.5, 0.1) ]
本质上,这意味着我想按顺序迭代一组元素的幂集,并且我可以相当容易地生成它,对其进行排序,然后完成。然而,幂集变得相当大,速度非常快,我希望我通常会想要第一个子集之一,而且我宁愿不生成数千个子集的列表,对它们进行排序,然后永远不要看第三个子集。这就是 python 生成器有望拯救世界的地方!
问题的更正式规范,我需要找出一种方法来排序(powerset(input),key = lambda l:reduce(lambda(p,n),e:(p * e,n-1) )、l, (1, 0))、reverse=True),作为生成器,或以其他方式避免构建和排序整个列表。
我相当确定这与背包问题以及子集乘积问题有关,但我真的很难找到一个有效的好算法,并且非常感谢您的帮助:-)。在最坏的情况下(一直迭代到最后),它比构建+排序整个事情慢并不是问题,它只需要更好的最佳情况(例如,在前 10% 内)性能。
Practically, I've got a set of objects with probabilities, and I want to look at each possible group of them, in order of how likely it is that they're all true assuming they're independent -- i.e. in descending order of the product of the elements of the subsets -- or in order of length if the probabilities are the same (so that (1, 0.5) comes after (0.5)).
Example: If I have [ 1, 0.5, 0.1 ]
I want [ (), (1), (0.5), (1, 0.5), (0.1), (1, 0.1), (0.5, 0.1), (1, 0.5, 0.1) ]
In essence, this means I want to iterate over the powerset of a set of elements in order, and I could fairly easily generate this, sort it, and be done. However, powersets get pretty big pretty fast, I expect I'm usually going to want one of the first subsets, and I'd rather not generate a list of thousands of subsets, sort them, and then never look past the third. This is where python generators hopefully save the day!
More formal specification of the problem, I need to work out a way to do sorted(powerset(input), key = lambda l : reduce (lambda (p, n), e: (p * e, n-1), l, (1, 0)), reverse=True)
, as a generator, or in some other way that lets me avoid building and sorting the entire list.
I'm reasonably sure this is related to the knapsack problem, along with the subset product problem, but I'm really struggling to get a nice algorithm for it that works, and help would be very much appreciated :-). It's not an issue for it to be slower than building + sorting the whole thing in the worst case (iterating all the way to the end), it just needs much better best case (within the first 10%, say) performance.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
好问题,解决起来相当棘手。我也想不出一种按顺序生成组合的方法,但我使用强大的 heapq (又名优先级队列)来保持候选排序。
Nice question, it was quite tricky to solve. I can't think of a way to generate the combinations in order either, but I wield the mighty
heapq
(aka a priority queue) to keep the candidates sorted.