How to get all subsets of a set? (powerset)
Given a set
{0, 1, 2, 3}
How can I produce the subsets:
[set(),
{0},
{1},
{2},
{3},
{0, 1},
{0, 2},
{0, 3},
{1, 2},
{1, 3},
{2, 3},
{0, 1, 2},
{0, 1, 3},
{0, 2, 3},
{1, 2, 3},
{0, 1, 2, 3}]
The Python `itertools` page has exactly a `powerset` recipe for this:
Output:
If you don't like that empty tuple at the beginning, you can just change the `range` statement to `range(1, len(s)+1)` to avoid a 0-length combination.
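For reference, the `powerset` recipe in the `itertools` documentation looks roughly like the sketch below. The input `[0, 1, 2, 3]` is taken from the question; the variable name `subsets` is only illustrative.

```python
from itertools import chain, combinations

def powerset(iterable):
    """powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"""
    s = list(iterable)
    # Chain together the combinations of every length from 0 to len(s).
    return chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))

subsets = list(powerset([0, 1, 2, 3]))
print(subsets)
```

The result is a list of tuples, starting with the empty tuple and ending with the full tuple; wrap each one in `set(...)` if actual sets are needed.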
Here is more code for a powerset. This is written from scratch:
Mark Rushakoff's comment is applicable here: "If you don't like that empty tuple at the beginning, you can just change the range statement to range(1, len(s)+1) to avoid a 0-length combination", except in my case you change `for i in range(1 << x)` to `for i in range(1, 1 << x)`.
Returning to this years later, I'd now write it like this:
And then the test code would look like this, say:
Using `yield` means that you do not need to calculate all results in a single piece of memory. Precalculating the masks outside the main loop is assumed to be a worthwhile optimization.
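A minimal sketch of the generator version described above, assuming the subsets are returned as lists. The names `items` and `masks` are illustrative and not necessarily those of the original code.

```python
def powerset(seq):
    """Yield every subset of seq as a list, one at a time."""
    items = list(seq)
    n = len(items)
    # Precompute one bit mask per element, outside the main loop.
    masks = [1 << j for j in range(n)]
    # Each integer i in [0, 2**n) encodes one subset; use range(1, 1 << n)
    # instead to skip the empty subset.
    for i in range(1 << n):
        yield [item for mask, item in zip(masks, items) if i & mask]

for subset in powerset([0, 1, 2]):
    print(subset)
```

Because the function is a generator, only one subset is held in memory at a time unless the caller collects them with `list(...)`.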
If you're looking for a quick answer, I just searched "python power set" on Google and came up with this: Python Power Set Generator
Here's a copy-paste from the code in that page:
This can be used like this:
Now r is a list of all the elements you wanted, and can be sorted and printed:
Use function `powerset()` from package `more-itertools`.
If you want sets, use:
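A short usage sketch of the above. `more-itertools` is a third-party package, so the fallback branch (an assumption added here so the example runs without it) defines the equivalent recipe from the `itertools` documentation.

```python
try:
    # Third-party package: pip install more-itertools
    from more_itertools import powerset
except ImportError:
    # Fallback so the sketch still runs without the package.
    from itertools import chain, combinations

    def powerset(iterable):
        s = list(iterable)
        return chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))

tuples = list(powerset([0, 1, 2]))
# Convert each tuple to a set if sets are wanted.
sets = list(map(set, powerset([0, 1, 2])))
print(tuples)
print(sets)
```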
I have found the following algorithm very clear and simple:
Another way one can generate the powerset is by generating all binary numbers that have `n` bits. As with the power set, the number of binary numbers with `n` digits is `2 ^ n`. The principle of this algorithm is that an element could be present or not in a subset, just as a binary digit could be one or zero but not both.
I found both algorithms when I was taking MITx: 6.00.2x Introduction to Computational Thinking and Data Science, and I consider it one of the easiest algorithms to understand that I have seen.
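The binary-number idea can be sketched as follows. This is one possible way to write it, not necessarily the course's code; `power_set` and its variable names are illustrative.

```python
def power_set(items):
    """Yield each subset of items by counting from 0 to 2**n - 1."""
    items = list(items)
    n = len(items)
    for i in range(2 ** n):
        subset = []
        for j in range(n):
            # Bit j of i decides whether items[j] is in this subset.
            if (i >> j) % 2 == 1:
                subset.append(items[j])
        yield subset

print(list(power_set(["a", "b", "c"])))
```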
There is a refinement of powerset:
TL;DR (go directly to Simplification)
I know I have previously added an answer, but I really like my new implementation. I am taking a set as input, but it actually could be any iterable, and I am returning a set of sets which is the power set of the input. I like this approach because it is more aligned with the mathematical definition of power set (set of all subsets).
If you want exactly the output you posted in your question use this:
Explanation
It is known that the number of elements of the power set is `2 ** len(A)`, so that can clearly be seen in the `for` loop.
I need to convert the input (ideally a set) into a list because a set is a data structure of unique unordered elements, and the order will be crucial to generate the subsets.
`selector` is key in this algorithm. Note that `selector` has the same length as the input set, and to make this possible it is using an f-string with padding. Basically, this allows me to select the elements that will be added to each subset during each iteration. Let's say the input set has 3 elements `{0, 1, 2}`, so `selector` will take values between 0 and 7 (inclusive), which in binary are 000, 001, 010, 011, 100, 101, 110 and 111.
So, each bit can serve as an indicator of whether an element of the original set should be added or not. Look at the binary numbers, and just think of each digit as standing for an element of the original set, in which `1` means that the element at index `j` should be added, and `0` means that this element should not be added.
I am using a set comprehension to generate a subset at each iteration, and I convert this subset into a `frozenset` so I can add it to `ps` (power set). Otherwise, I won't be able to add it because a set in Python can only contain hashable objects, and a plain `set` is not hashable.
Simplification
You can simplify the code using some Python comprehensions, so you can get rid of those for loops. You can also use `zip` to avoid using the `j` index, and the code will end up as the following:
That's it. What I like about this algorithm is that it is clearer and more intuitive than others, because it looks quite magical to rely on `itertools` even though it works as expected.
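A sketch of the approach explained above, in both the loop form and the simplified form. The zero-padded format `f"{i:0{length}b}"` is an assumption about how the padding is done; the names `power_set_simplified` and `elements` are hypothetical.

```python
def power_set(A):
    """Return the power set of A as a set of frozensets."""
    elements = list(A)          # fix an order so bits map to elements
    length = len(elements)
    ps = set()
    for i in range(2 ** length):
        # Binary form of i, zero-padded to one digit per element.
        selector = f"{i:0{length}b}"
        subset = {elements[j] for j, bit in enumerate(selector) if bit == "1"}
        ps.add(frozenset(subset))   # frozenset is hashable, set is not
    return ps

def power_set_simplified(A):
    """Same idea with comprehensions and zip instead of the index j."""
    elements = list(A)
    length = len(elements)
    return {
        frozenset(e for e, bit in zip(elements, f"{i:0{length}b}") if bit == "1")
        for i in range(2 ** length)
    }

print(power_set({0, 1, 2}))
```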
This can be done very naturally with `itertools.product`:
For example:
yields
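One way the `itertools.product` idea might look, as a sketch: each tuple of booleans acts as a keep/drop mask over the elements. The use of `itertools.compress` is a choice made here for illustration, not something stated in the answer.

```python
from itertools import compress, product

def powerset(iterable):
    """Yield all subsets of the given iterable as tuples."""
    s = list(iterable)
    # Each tuple of booleans says, per element, whether to keep it.
    for mask in product([False, True], repeat=len(s)):
        yield tuple(compress(s, mask))

print(list(powerset([1, 2, 3])))
```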
I know this is too late
There are many other solutions already but still...
I just wanted to provide the most comprehensible solution, the anti code-golf version.
The results
All sets of length 0
[()]
All sets of length 1
[('x',), ('y',), ('z',)]
All sets of length 2
[('x', 'y'), ('x', 'z'), ('y', 'z')]
All sets of length 3
[('x', 'y', 'z')]
For more, see the itertools docs and the Wikipedia entry on power sets.
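A deliberately verbose sketch that reproduces the results listed above, assuming the input is the list `["x", "y", "z"]` and that subsets are grouped by length. The function and variable names are illustrative.

```python
from itertools import combinations

items = ["x", "y", "z"]

def powerset(items):
    """Return a list with one entry per subset length."""
    combos = []
    for length in range(len(items) + 1):
        combos.append(list(combinations(items, length)))
    return combos

for length, group in enumerate(powerset(items)):
    print(f"All sets of length {length}")
    print(group)
```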
With the empty set, which is one of the subsets, you could use:
Output:
For sorted subsets, we can do:
Output:
Code first, for those who want a simple answer.
I have a good explanation here https://leetcode.com/problems/subsets/solutions/3138042/simple-python-solution/
But the short answer is that you start with the set of the empty set, i.e. "sets = [[]]". I recommend putting a print statement under "for i in s", i.e. "print(sets)", to see that it doubles for each element i.
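A minimal sketch of the doubling idea described above, using the names `sets`, `s` and `i` from the text. The function name `subsets` is an assumption.

```python
def subsets(s):
    sets = [[]]                    # start with only the empty subset
    for i in s:
        # print(sets)              # uncomment to watch the list double
        # Every existing subset gets a copy that also contains i.
        sets += [subset + [i] for subset in sets]
    return sets

print(subsets([1, 2, 3]))
```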
Just a quick power set refresher!
Here is another way of finding the power set:
Full credit to the source.
If you want any specific length of subsets you can do it like this:
More generally, for arbitrary-length subsets you can modify the range argument. The output is
[(), (0,), (1,), (2,), (3,), (0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3), (0, 1, 2, 3)]
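A sketch of both cases, assuming `itertools.combinations` is what produces the tuples shown. The names `some_set`, `pairs` and `all_subsets` are illustrative, and the set is sorted first so the order is deterministic.

```python
from itertools import combinations

some_set = {0, 1, 2, 3}
items = sorted(some_set)

# Subsets of one specific length, here 2.
pairs = list(combinations(items, 2))

# All lengths: let the range run from 0 to len(items) inclusive.
all_subsets = [x for r in range(len(items) + 1) for x in combinations(items, r)]
print(all_subsets)
```

Changing the `range(...)` bounds, for example to `range(2, 4)`, restricts the result to subsets of length 2 and 3.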
You can do it like this:
Output:
A simple way would be to harness the internal representation of integers under 2's complement arithmetic.
The binary representation of integers is {000, 001, 010, 011, 100, 101, 110, 111} for numbers ranging from 0 to 7. For an integer counter value, considering '1' as inclusion of the corresponding element in the collection and '0' as exclusion, we can generate subsets based on the counting sequence. Numbers have to be generated from `0` to `pow(2,n) -1`, where n is the length of the array, i.e. the number of bits in the binary representation.
A simple Subset Generator Function based on it can be written as below. It basically relies
and then it can be used as
Testing
Adding the following in a local file
gives the following output
Almost all of these answers use `list` rather than `set`, which felt like a bit of a cheat to me. So, out of curiosity I tried to do a simple version truly on `set` and summarize for other "new to Python" folks.
I found there are a couple of oddities in dealing with Python's set implementation. The main surprise to me was handling empty sets. This is in contrast to Ruby's Set implementation, where I can simply do `Set[Set[]]` and get a `Set` containing one empty `Set`, so I found it initially a little confusing.
To review, in doing `powerset` with `set`s, I encountered two problems:
- `set()` takes an iterable, so `set(set())` will return `set()` because the empty set iterable is empty (duh I guess :))
- `set({set()})` and `set.add(set)` won't work because `set()` isn't hashable
To solve both issues, I made use of `frozenset()`, which means I don't quite get what I want (type is literally `set`), but makes use of the overall `set` interface.
Below we get 2⁴ (16) `frozenset`s correctly as output:
As there's no way to have a `set` of `set`s in Python, if you want to turn these `frozenset`s into `set`s, you'll have to map them back into a `list` (`list(map(set, powerset(set([1,2,3,4]))))`) or modify the above.
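A sketch of a `set`-based powerset along the lines described above, together with the two pitfalls. This is one possible implementation written for illustration, not necessarily the answer's own code.

```python
def powerset(original_set):
    """Return the power set of original_set as a set of frozensets."""
    # Start with a set containing only the empty (frozen) set.
    ps = {frozenset()}
    for member in original_set:
        # For every subset found so far, also keep a copy with member added.
        ps |= {subset | {member} for subset in ps}
    return ps

# The two pitfalls described above:
assert set(set()) == set()      # set() consumes the iterable, which is empty
try:
    {set()}                     # a plain set is unhashable
except TypeError as exc:
    print("TypeError:", exc)

result = powerset({1, 2, 3, 4})
print(len(result))              # number of subsets, 2**4
as_sets = list(map(set, result))  # plain sets must live in a list, not a set
```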
Perhaps the question is getting old, but I hope my code will help someone.
Getting all the subsets with recursion. Crazy-ass one-liner
Based on a Haskell solution
This is wild because none of these answers actually provide the return of an actual Python set. Here is a messy implementation that will give a powerset that actually is a Python `set`.
I'd love to see a better implementation, though.
Here is my quick implementation utilizing combinations but using only built-ins.
All subsets in range n as set:
A variation of the question is an exercise I saw in the book "Discovering Computer Science: Interdisciplinary Problems, Principles, and Python Programming. 2015 edition". In that exercise, 10.2.11, the input is just an integer, and the output should be the power set. Here is my recursive solution (not using anything but basic python3):
And the output is
[[], [4], [3], [4, 3], [2], [4, 2], [3, 2], [4, 3, 2], [1], [4, 1], [3, 1], [4, 3, 1], [2, 1], [4, 2, 1], [3, 2, 1], [4, 3, 2, 1]]
Number of sublists: 16
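A recursive sketch that reproduces the output shown above for the input 4. The function name `power_set` and the helper parameter `k` are assumptions; the author's own solution may be structured differently.

```python
def power_set(n, k=1):
    """Return all sublists of [k, ..., n], built recursively."""
    if k > n:
        return [[]]                       # only the empty list remains
    rest = power_set(n, k + 1)            # sublists that do not contain k
    # Keep those, then add a copy of each with k appended.
    return rest + [sub + [k] for sub in rest]

result = power_set(4)
print(result)
print("Number of sublists:", len(result))
```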
I hadn't come across the `more_itertools.powerset` function and would recommend using that. I also recommend not using the default ordering of the output from `itertools.combinations`; often you instead want to minimise the distance between the positions, and sort the subsets of items with shorter distance between them above/before the items with larger distance between them.
The `itertools` recipes page shows it uses `chain.from_iterable`.
`r` here matches the standard notation for the lower part of a binomial coefficient; the `s` is usually referred to as `n` in mathematics texts and on calculators ("n Choose r").
The other examples here give the powerset of `[1,2,3,4]` in such a way that the 2-tuples are listed in "lexicographic" order (when we print the numbers as integers). If I write the distance between the numbers alongside it (i.e. the difference), it shows my point:
The correct order for subsets should be the order which 'exhausts' the minimal distance first, like so:
Using numbers here makes this ordering look 'wrong', but consider for example the letters `["a","b","c","d"]`; it is clearer why this might be useful to obtain the powerset in this order:
This effect is more pronounced with more items, and for my purposes it determines whether I can describe the ranges of the indexes of the powerset meaningfully.
(There is a lot written on Gray codes etc. for the output order of algorithms in combinatorics, I don't see it as a side issue).
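To make the ordering argument concrete for the 2-tuples only, here is a small sketch. It assumes "distance" means the gap between the two positions, which is an interpretation of the text; it does not attempt the full ordering for larger subsets.

```python
from itertools import combinations

items = [1, 2, 3, 4]
pairs = list(combinations(range(len(items)), 2))   # index pairs, lexicographic

# Assumption: "distance" is the gap between the two positions.
by_distance = sorted(pairs, key=lambda p: (p[1] - p[0], p[0]))

for i, j in by_distance:
    print((items[i], items[j]), "distance", j - i)
```

With this key, all adjacent pairs come first, then pairs two positions apart, and the pair spanning the whole list comes last.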
I actually just wrote a fairly involved program which used this fast integer partition code to output the values in the proper order, but then I discovered `more_itertools.powerset` and for most uses it's probably fine to just use that function like so:
I wrote some more involved code which will print the powerset nicely (see the repo for pretty printing functions I've not included here: `print_partitions`, `print_partitions_by_length`, and `pprint_tuple`).
pset_partitions.py
This is all pretty simple, but still might be useful if you want some code that'll let you get straight to accessing the different levels of the powerset:
As an example, I wrote a CLI demo program which takes a string as a command line argument:
Here is my solution; it is similar (conceptually) to the solution of lmiguelvargasf.
Let me say that
-[math item] by definition the powerset does contain the empty set
-[personal taste] and also that I don't like using frozenset.
So the input is a list and the output will be a list of lists. The function could close earlier, but I like the elements of the power set to be ordered lexicographically, which essentially means nicely.