Counting the subsets of given sizes of a set
Given a set C with n elements (duplicates allowed) and a partition P of n
P = {i1, i2, ... / i1+i2+... = n}
how many different decompositions of C into subsets of sizes i1, i2, ... are there?
Example:
C = {2 2 2 3}
P = {2 2}
C = {2 2} U {2 3}
P = {1 1 2}
C = {2} U {2} U {2 3}
C = {2} U {3} U {2 2}
P = {1 3}
C = {2} U {2 2 3}
C = {3} U {2 2 2}
I have a solution, but it is inefficient when C has more than a dozen elements.
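For reference, a brute-force sketch of such a solution in Python (the name count_decompositions and its structure are illustrative, not the actual code behind the question): it tries every way to pick the blocks and deduplicates the results, which is exactly the kind of approach that stops working past a dozen or so elements.

```python
from itertools import combinations

def count_decompositions(C, P):
    """Brute force: enumerate every choice of blocks, canonicalize each
    decomposition as a sorted tuple of sorted blocks, and count the
    distinct ones. Exponential in len(C); small inputs only."""
    C = sorted(C)
    seen = set()

    def rec(indices, sizes, blocks):
        if not sizes:
            seen.add(tuple(sorted(blocks)))
            return
        for chosen in combinations(indices, sizes[0]):
            block = tuple(sorted(C[i] for i in chosen))
            rest = [i for i in indices if i not in set(chosen)]
            rec(rest, sizes[1:], blocks + [block])

    rec(list(range(len(C))), sorted(P), [])
    return len(seen)

# The examples from the question:
print(count_decompositions([2, 2, 2, 3], [2, 2]))     # 1
print(count_decompositions([2, 2, 2, 3], [1, 1, 2]))  # 2
print(count_decompositions([2, 2, 2, 3], [1, 3]))     # 2
```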
Thanks in advance
Philippe
2 Answers
The fact that the order of decomposition does not matter to you makes it much harder. That is, you are viewing {2 2} U {2 3} as the same as {2 3} U {2 2}. Still, I have an algorithm that is better than what you have, but it is not great.

Let me start with a realistically complicated example. Our set will be A B C D E F F F F G G G G. The partition will be 1 1 1 1 2 2 5.

My first simplification is to represent the information we care about in the set with the data structure [[2, 4], [5, 1]], meaning 2 elements are repeated 4 times and 5 elements are repeated once.

My second, apparent complication is to represent the partition with [[5, 1, 1], [2, 2, 1], [1, 4, 1]]. The pattern may not be obvious: each entry is of the form [size, count, frequency]. So the one distinct instance of 2 partitions of size 2 turns into [2, 2, 1]. We are not using frequency yet, but it counts distinguishable piles of the same size and commonness.

Now we recurse as follows. We take the most common element and find all of the ways to use it up. So in our case we take one of the piles of size 4, and find that we can divide it as follows, rearranging each remaining partition strategy into lexicographic order:
[4] leaving [[1, 1, 1], [2, 2, 1], [1, 4, 1]] = [[2, 2, 1], [1, 4, 1], [1, 1, 1]]
[3, [1, 0], 0] leaving [[2, 1, 1], [1, 1, 1], [2, 1, 1], [1, 4, 1]] = [[2, 1, 2], [1, 4, 1], [1, 1, 1]] (Note that we're now using frequency.)
[3, 0, 1] leaving [[2, 1, 1], [2, 2, 1], [0, 1, 1], [1, 3, 1]] = [[2, 2, 1], [2, 1, 1], [1, 3, 1]]
[2, [2, 0], 0] leaving [[3, 1, 1], [0, 1, 1], [2, 1, 1], [1, 4, 1]] = [[3, 1, 1], [2, 1, 1], [1, 4, 1]]
[2, [1, 1], 0] leaving [[3, 1, 1], [1, 2, 1], [1, 4, 1]] = [[3, 1, 1], [1, 4, 1], [1, 2, 1]]
[2, [1, 0], [1]] leaving [[3, 1, 1], [1, 1, 1], [2, 1, 1], [0, 1, 1], [1, 3, 1]] = [[3, 1, 1], [2, 1, 1], [1, 4, 1], [1, 1, 1]]
[2, 0, [1, 1]] leaving [[3, 1, 1], [2, 2, 1], [0, 2, 1], [1, 2, 1]] = [[3, 1, 1], [2, 2, 1], [1, 2, 1]]
[1, [2, 1]] leaving [[4, 1, 1], [0, 1, 1], [1, 1, 1], [1, 4, 1]] = [[4, 1, 1], [1, 4, 1], [1, 1, 1]]
[1, [2, 0], [1]] leaving [[4, 1, 1], [0, 1, 1], [2, 1, 1], [0, 1, 1], [1, 3, 1]] = [[4, 1, 1], [2, 1, 1], [1, 3, 1]]
[1, [1, 0], [1, 1]] leaving [[4, 1, 1], [1, 1, 1], [2, 1, 1], [0, 2, 1], [1, 2, 1]] = [[4, 1, 1], [2, 1, 1], [1, 2, 1], [1, 1, 1]]
[1, 0, [1, 1, 1]] leaving [[4, 1, 1], [2, 2, 1], [0, 3, 1], [1, 1, 1]] = [[4, 1, 1], [2, 2, 1], [1, 1, 1]]
[0, [2, 2]] leaving [[5, 1, 1], [0, 2, 1], [1, 4, 1]] = [[5, 1, 1], [1, 4, 1]]
[0, [2, 1], [1]] leaving [[5, 1, 1], [0, 1, 1], [1, 1, 1], [0, 1, 1], [1, 3, 1]] = [[5, 1, 1], [1, 3, 1], [1, 1, 1]]
[0, [2, 0], [1, 1]] leaving [[5, 1, 1], [0, 2, 1], [2, 1, 1], [0, 2, 1], [1, 2, 1]] = [[5, 1, 1], [2, 1, 1], [1, 2, 1]]
[0, [1, 1], [1, 1]] leaving [[5, 1, 1], [1, 2, 1], [0, 2, 1], [1, 2, 1]] = [[5, 1, 1], [1, 2, 2]]
[0, [1, 0], [1, 1, 1]] leaving [[5, 1, 1], [1, 1, 1], [2, 1, 1], [0, 3, 1], [1, 1, 1]] = [[5, 1, 1], [2, 1, 1], [1, 1, 2]]
[0, 0, [1, 1, 1, 1]] leaving [[5, 1, 1], [2, 2, 1], [0, 4, 1]] = [[5, 1, 1], [2, 2, 1]]
Now each of those subproblems can be solved recursively. This may feel like we're on the way to constructing them all, but we aren't, because we memoize the recursive steps. It turns out that there are a lot of ways that the first two groups of 8 can wind up with the same 5 leftovers. With memoization we don't need to repeatedly recalculate those solutions.
That said, we'll do better. Groups of 12 elements should not pose a problem. But we're not doing that much better. I wouldn't be surprised if it starts breaking down somewhere around groups of 30 or so elements with interesting sets of partitions. (I haven't coded it. It may be fine at 30 and break down at 50. I don't know where it will break down. But given that you're iterating over sets of partitions, at some fairly small point it will break down.)
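For what it's worth, here is a Python sketch of a memoized recursion along these lines. It is not a transcription of the bookkeeping above: instead of explicit [size, count, frequency] entries, the state keeps one (remaining capacity, pile count) pair per class of still-indistinguishable piles, with repeated pairs playing the role of the frequency field; all names (count_decompositions, bounded_parts, solve) are mine.

```python
from collections import Counter
from functools import lru_cache

def bounded_parts(total, length, max_part):
    """Non-increasing tuples of `length` integers, each <= max_part,
    summing to `total` (zeros allowed)."""
    if length == 0:
        if total == 0:
            yield ()
        return
    for first in range(min(total, max_part), -1, -1):
        for rest in bounded_parts(total - first, length - 1, first):
            yield (first,) + rest

def count_decompositions(C, P):
    """Count distinct decompositions of multiset C into blocks whose sizes
    are the parts of P, by distributing one element value at a time over
    classes of piles that are still indistinguishable."""
    assert sum(P) == len(C)
    # Only the multiplicities of the values matter, not the values themselves.
    mults = tuple(sorted(Counter(C).values()))
    # Piles of equal size start out indistinguishable: one class per size,
    # stored as (remaining capacity, number of piles in the class).
    start = tuple(sorted(Counter(P).items()))

    @lru_cache(maxsize=None)
    def solve(mults, classes):
        if not mults:
            return 1
        m, rest = mults[-1], mults[:-1]   # use up the most common value first

        def distribute(i, left, acc):
            if i == len(classes):
                return solve(rest, tuple(sorted(acc))) if left == 0 else 0
            cap, cnt = classes[i]
            ways = 0
            for t in range(min(left, cap * cnt) + 1):
                for assignment in bounded_parts(t, cnt, cap):
                    # Piles receiving the same number of copies stay
                    # indistinguishable; different counts split the class.
                    new = list(acc)
                    for copies, k in Counter(assignment).items():
                        if cap - copies > 0:
                            new.append((cap - copies, k))
                    ways += distribute(i + 1, left - t, new)
            return ways

        return distribute(0, m, [])

    return solve(mults, start)

print(count_decompositions([2, 2, 2, 3], [2, 2]))     # 1
print(count_decompositions([2, 2, 2, 3], [1, 1, 2]))  # 2
print(count_decompositions("ABCDEFFFFGGGG", [1, 1, 1, 1, 2, 2, 5]))  # the example above
```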
中不同元素的数量。All partitions can be found in 2 stages.
First: from
P
make new ordered partition of n,P_S={P_i1, P_i2, ..., P_ip}
, summing identical i's.Make partitions
{C_i1, C_i2, ..., C_ip}
ofC
with respect toP_S
. Note,C_ix
is multi-set likeC
. It is partitioning C into multi-sets by sizes of final partitions.Second: for each
{C_i1, C_i2, ..., C_ip}
and for eachix, x={1,2,...,p}
find number of partitions ofC_ix
intot
(number ofix's
inP
) sets withix
elements. Call this numberN(C_ix,ix,t)
.Total number of partitions is:
The first part can be done recursively quite simply. The second is more complicated. A partition of a multi-set M into n parts with k elements each is the same as finding all partially sorted lists with elements from M. A partially sorted list is of the form

(a_1_1, a_1_2, ..., a_1_k), (a_2_1, a_2_2, ..., a_2_k), ..., (a_n_1, a_n_2, ..., a_n_k)

where a_i_x <= a_i_y if x <= y, and (a_x_1, a_x_2, ..., a_x_k) <= (a_y_1, a_y_2, ..., a_y_k) lexicographically if x < y. With these two conditions it is possible to create all partitions counted by N(C_ix, ix, t) recursively.

For some cases N(C_ix, ix, t) is easy to calculate. Define |C_ix| as the number of different elements in the multi-set C_ix.
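As an illustration of the first stage, here is a recursive Python sketch (the name split_into_groups and its interface are mine, not from this answer) that enumerates the distinct ways to split C into the groups C_i1, ..., C_ip of prescribed total sizes ix * t:

```python
from itertools import combinations

def split_into_groups(C, group_sizes):
    """Yield the distinct ways to split multiset C into groups of the given
    sizes (group x collects the elements destined for the blocks of size ix,
    so its size is ix * t). Groups are kept in the order of group_sizes."""
    def rec(remaining, sizes):
        if not sizes:
            yield ()
            return
        seen = set()
        for idxs in combinations(range(len(remaining)), sizes[0]):
            group = tuple(remaining[i] for i in idxs)
            if group in seen:          # same sub-multiset already chosen
                continue
            seen.add(group)
            rest = [remaining[i] for i in range(len(remaining)) if i not in set(idxs)]
            for tail in rec(rest, sizes[1:]):
                yield (group,) + tail

    yield from rec(sorted(C), list(group_sizes))

# P = {1, 3} from the question: one block of size 1 and one of size 3,
# so the two groups have sizes 1*1 and 3*1.
for split in split_into_groups([2, 2, 2, 3], [1, 3]):
    print(split)    # ((2,), (2, 2, 3)) and ((3,), (2, 2, 2))
```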