具有固定子集大小的子集和

发布于 2024-12-27 15:11:24 字数 1134 浏览 4 评论 0原文

总和子集问题指出:

给定一组整数,是否存在和为零的非空子集?

这个问题一般来说是NP完全问题。我很好奇这个轻微变体的复杂性是否已知:

给定一组整数,是否存在大小为 k 且总和为零的子集?

例如,如果k = 1,您可以进行二分搜索以在O(log n)中找到答案。如果k = 2,那么你可以将其降低到O(n log n)(例如参见从数组中查找总和等于给定数字的一对元素)。如果k = 3,那么您可以执行O(n^2)(例如参见在数组中查找总和最接近给定数字的三个元素)。

是否存在可以将此问题作为 k 函数的已知界限?

作为动机,我正在考虑这个问题如何将数组分区为2 个部分,使得这两个部分有平均数相等? 并尝试确定它是否实际上是 NP 完全的。答案在于是否存在如上所述的公式。

除非有通用的解决方案,否则我对了解 k=4 的最佳界限非常感兴趣。

The sum-subset problem states:

Given a set of integers, is there a non-empty subset whose sum is zero?

This problem is NP-complete in general. I'm curious if the complexity of this slight variant is known:

Given a set of integers, is there a subset of size k whose sum is zero?

For example, if k = 1, you can do a binary search to find the answer in O(log n). If k = 2, then you can get it down to O(n log n) (e.g. see Find a pair of elements from an array whose sum equals a given number). If k = 3, then you can do O(n^2) (e.g. see Finding three elements in an array whose sum is closest to a given number).

Is there a known bound that can be placed on this problem as a function of k?

As motivation, I was thinking about this question How do you partition an array into 2 parts such that the two parts have equal average? and trying to determine if it is actually NP-complete. The answer lies in whether or not there is a formula as described above.

Barring a general solution, I'd be very interested in knowing an optimal bound for k=4.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(6

恋你朝朝暮暮 2025-01-03 15:11:24

对于 k=4,空间复杂度 O(n),时间复杂度 O(n2 * log(n))

对数组进行排序。从 2 个最小和 2 个最大元素开始,按非递减顺序计算 2 个元素 (a[i] + a[j]) 的所有较小总和,并且所有 < code>大于 2 个元素(a[k] + a[l]) 按非递增顺序求和。如果总和小于零,则增加 lesser sum;如果总和大于零,则减少 greater 1;当总和为零(成功)或 a 时停止[i]+a[j]> a[k] + a[l](失败)。

技巧是以这样的方式迭代所有索引 ij,即 (a[i] + a[j])永远不会减少。对于 kl(a[k] + a[l]) 永远不应该增加。优先级队列有助于做到这一点:

  1. key=(a[i] + a[j]), value=(i = 0, j = 1) 放入优先级队列。
  2. 从优先级队列中弹出 (sum, i, j)
  3. 在上述算法中使用 sum
  4. (a[i+1] + a[j]), i+1, j(a[i] + a[j+1]), i, j+1 仅当这些元素尚未使用时才放入优先级队列。要跟踪使用的元素,请为每个“i”维护一个最大使用的“j”数组。仅使用大于“i”的“j”值就足够了。
  5. 从步骤 2 继续。

对于 k>4

如果空间复杂度限制为 O(n),我找不到比对 k-4 值使用暴力破解更好的方法了上述算法适用于剩余的 4 值。时间复杂度 O(n(k-2) * log(n))。

对于非常大的k整数线性规划可能会带来一些改进。

更新

如果n非常大(与最大整数值的顺序相同),则可以实现 O(1) 优先级队列,将复杂度提高到 O(n< support>2) 和 O(n(k-2))。

如果n >= k * INT_MAX,则空间复杂度为 O(n) 的不同算法是可能的。预先计算所有可能的 k/2 值之和的位集。并用它来检查其他 k/2 值的总和。时间复杂度为 O(n(ceil(k/2)))。

For k=4, space complexity O(n), time complexity O(n2 * log(n))

Sort the array. Starting from 2 smallest and 2 largest elements, calculate all lesser sums of 2 elements (a[i] + a[j]) in the non-decreasing order and all greater sums of 2 elements (a[k] + a[l]) in the non-increasing order. Increase lesser sum if total sum is less than zero, decrease greater one if total sum is greater than zero, stop when total sum is zero (success) or a[i] + a[j] > a[k] + a[l] (failure).

The trick is to iterate through all the indexes i and j in such a way, that (a[i] + a[j]) will never decrease. And for k and l, (a[k] + a[l]) should never increase. A priority queue helps to do this:

  1. Put key=(a[i] + a[j]), value=(i = 0, j = 1) to priority queue.
  2. Pop (sum, i, j) from priority queue.
  3. Use sum in the above algorithm.
  4. Put (a[i+1] + a[j]), i+1, j and (a[i] + a[j+1]), i, j+1 to priority queue only if these elements were not already used. To keep track of used elements, maintain an array of maximal used 'j' for each 'i'. It is enough to use only values for 'j', that are greater, than 'i'.
  5. Continue from step 2.

For k>4

If space complexity is limited to O(n), I cannot find anything better, than use brute force for k-4 values and the above algorithm for the remaining 4 values. Time complexity O(n(k-2) * log(n)).

For very large k integer linear programming may give some improvement.

Update

If n is very large (on the same order as maximum integer value), it is possible to implement O(1) priority queue, improving complexities to O(n2) and O(n(k-2)).

If n >= k * INT_MAX, different algorithm with O(n) space complexity is possible. Precalculate a bitset for all possible sums of k/2 values. And use it to check sums of other k/2 values. Time complexity is O(n(ceil(k/2))).

阪姬 2025-01-03 15:11:24

判断W+X+Y+Z={w+x+y+z|w+x+y+z|中是否为0的问题w in W, x in X, y in Y, z in Z} 基本上是相同的,除了没有烦人的退化情况(即,问题可以用最少的资源相互简化)。

该问题(以及 k = 4 的原始问题)具有 O(n^2 log n) 时间、O(n) 空间算法。 k = 2 的 O(n log n) 时间算法(确定 A + B 中是否为 0)按排序顺序访问 A,按反向排序顺序访问 B。因此,我们需要的是 A = W + X 的 O(n) 空间迭代器,它可以对称地重用 B = Y + Z。让 W = {w1, ..., wn} 按排序顺序。对于X中的所有x,将一个键值项(w1 + x, (1, x))插入到优先级队列中。重复删除最小元素 (wi + x, (i, x)) 并插入 (wi+1 + x, (i+1, x))。

The problem of determining whether 0 in W + X + Y + Z = {w + x + y + z | w in W, x in X, y in Y, z in Z} is basically the same except for not having annoying degenerate cases (i.e., the problems are inter-reducible with minimal resources).

This problem (and thus the original for k = 4) has an O(n^2 log n)-time, O(n)-space algorithm. The O(n log n)-time algorithm for k = 2 (to determine whether 0 in A + B) accesses A in sorted order and B in reverse sorted order. Thus all we need is an O(n)-space iterator for A = W + X, which can be reused symmetrically for B = Y + Z. Let W = {w1, ..., wn} in sorted order. For all x in X, insert a key-value item (w1 + x, (1, x)) into a priority queue. Repeatedly remove the min element (wi + x, (i, x)) and insert (wi+1 + x, (i+1, x)).

苦行僧 2025-01-03 15:11:24

非常相似的问题:

这个变体是子集和问题更容易解决吗?

它仍然是 NP 完全的。

如果不是,子集和也将在 P 中,因为它可以表示为 F(1) | F(2) | F(2) | ... F(n) 其中 F 是您的函数。这将有 O(O(F(1)) + O(F(2)) + O(F(n))) ,它仍然是多项式,这是不正确的,因为我们知道它是 NP -完全的。

请注意,如果您对输入有一定的限制,则可以获得多项式时间。

另请注意,暴力运行时间可以使用二项式系数计算。

Question that is very similar:

Is this variant of the subset sum problem easier to solve?

It's still NP-complete.

If it were not, the subset-sum would also be in P, as it could be represented as F(1) | F(2) | ... F(n) where F is your function. This would have O(O(F(1)) + O(F(2)) + O(F(n))) which would still be polynomial, which is incorrect as we know it's NP-complete.

Note that if you have certain bounds on the inputs you can achieve polynomial time.

Also note that the brute-force runtime can be calculated with binomial coefficients.

じ违心 2025-01-03 15:11:24

O(n^2log(n)) 中 k=4 的解

步骤 1:计算两两求和并对列表进行排序。有 n(n-1)/2 个和。所以复杂度是O(n^2log(n))。保留获得总和的个人的身份。

步骤 2:对于上面列表中的每个元素,搜索补集并确保它们不共享“个体”。有 n^2 次搜索,每个搜索的复杂度为 O(log(n))

编辑:空间复杂度原始算法是 O(n^2)。通过模拟虚拟 2D 矩阵,可以将空间复杂度降低到 O(1)(如果

首先考虑存储数组的排序版本,则空间复杂度为 O(n))。矩阵:对数字进行排序并创建一个现在,矩阵 X 的所有行和列都已排序。要在该矩阵中搜索值,请搜索对角线上的数字(如果数字位于 X[i,i 之间)。 ] 和 X[i+1,i+1],基本上可以通过矩阵 X[i:N, 0:i] 和 X[0:i, i:N] 将搜索空间减半,结果搜索算法为: O(log^2n)(我是不太确定。有人可以检查一下吗?)

现在,不要使用实际矩阵,而是使用虚拟矩阵,其中 X[i,j] 是根据需要计算的,而不是预先计算它们

。 )^2).

PS:在下面的链接中,它说2D排序矩阵搜索的复杂度是O(n)复杂度。如果这是真的(即 O(log^2n) 不正确),则最终复杂度为 O(n^3)。

The solution for k=4 in O(n^2log(n))

Step 1: Calculate the pairwise sum and sort the list. There are n(n-1)/2 sums. So the complexity is O(n^2log(n)). Keep the identities of the individuals which make the sum.

Step 2: For each element in the above list search for the complement and make sure they don't share "the individuals). There are n^2 searches, each with complexity O(log(n))

EDIT: The space complexity of the original algorithm is O(n^2). The space complexity can be reduced to O(1) by simulating a virtual 2D matrix (O(n), if you consider space to store sorted version of the array).

First about 2D matrix: sort the numbers and create a matrix X using pairwise sums. Now the matrix is ins such a way that all the rows and columns are sorted. To search for a value in this matrix, search the numbers on the diagonal. If the number is in between X[i,i] and X[i+1,i+1], you can basically halve the search space by to matrices X[i:N, 0:i] and X[0:i, i:N]. The resulting search algorithm is O(log^2n) (I AM NOT VERY SURE. CAN SOMEBODY CHECK IT?).

Now, instead of using a real matrix, use a virtual matrix where X[i,j] are calculated as needed instead of pre-computing them.

Resulting time complexity: O( (nlogn)^2 ).

PS: In the following link, it says the complexity of 2D sorted matrix search is O(n) complexity. If that is true (i.e. O(log^2n) is incorrect), then the finally complexity is O(n^3).

毅然前行 2025-01-03 15:11:24

以 awesomo 的答案为基础……如果我们可以假设数字已排序,那么对于给定的 k,我们可以比 O(n^k) 做得更好;只需获取大小为 (k-1) 的所有 O(n^(k-1)) 子集,然后对剩余的数字进行二分搜索,该数字在添加到第一个 (k-1) 时给出目标。这是 O(n^(k-1) log n)。这意味着复杂性肯定低于此。

事实上,如果我们知道 k=3 时的复杂度为 O(n^2),那么当 k > 时我们可以做得更好。 3:选择所有(k-3)子集,其中有O(n^(k-3)),然后在O(n^2)内对剩余元素求解问题。对于 k >= 3,这是 O(n^(k-1))。

但是,也许你可以做得更好?我会考虑一下这个问题。

编辑:我最初打算添加很多内容,提出对这个问题的不同看法,但我决定发布一个删节版本。我鼓励其他发帖者看看他们是否认为这个想法有任何优点。分析很困难,但可能足够疯狂,可以发挥作用。

我们可以利用 k 固定以及奇数和偶数之和以某种方式表现的事实来定义递归算法来解决这个问题。

首先,修改问题,使列表中同时包含偶数和奇数(如果全部为偶数,则可以通过除以 2 来完成,或者如果全部为奇数,则可以通过从数字中减去 1 并从目标总和中减去 k 来完成,然后重复如有必要)。

接下来,利用以下事实:仅使用偶数个奇数才能达到偶数目标和,并且仅使用奇数个奇数才能达到奇数目标和。生成适当的奇数子集,并使用偶数递归调用算法,总和减去正在检查的奇数子集的总和,以及 k 减去奇数子集的大小。当k=1时,进行二分查找。如果曾经k> n(不确定这会发生),返回 false。

如果奇数很少,这可以让您非常快速地选择必须属于获胜子集的术语,或丢弃不属于获胜子集的术语。通过使用减法技巧,您可以将具有大量偶数的问题转换为具有大量奇数的等效问题。因此,最糟糕的情况一定是当偶数和奇数的数字非常相似时......这就是我现在所处的情况。一个无用的宽松上限比暴力破解要差很多数量级,但我觉得这可能至少和暴力破解一样好。欢迎提出想法!

EDIT2:以上示例,用于说明。

{1, 2, 2, 6, 7, 7, 20}, k = 3, sum = 20.
Subset {}:
 {2, 2, 6, 20}, k = 3, sum = 20
 = {1, 1, 3, 10}, k = 3, sum = 10
 Subset {}:
  {10}, k = 3, sum = 10
  Failure
 Subset {1, 1}:
  {10}, k = 1, sum = 8
  Failure
 Subset {1, 3}:
  {10}, k = 1, sum = 6
  Failure
Subset {1, 7}:
 {2, 2, 6, 20}, k = 1, sum = 12
 Failure
Subset {7, 7}:
 {2, 2, 6, 20}, k = 1, sum = 6
 Success

To build on awesomo's answer... if we can assume that numbers are sorted, we can do better than O(n^k) for given k; simply take all O(n^(k-1)) subsets of size (k-1), then do a binary search in what remains for a number that, when added to the first (k-1), gives the target. This is O(n^(k-1) log n). This means the complexity is certainly less than that.

In fact, if we know that the complexity is O(n^2) for k=3, we can do even better for k > 3: choose all (k-3)-subsets, of which there are O(n^(k-3)), and then solve the problem in O(n^2) on the remaining elements. This is O(n^(k-1)) for k >= 3.

However, maybe you can do even better? I'll think about this one.

EDIT: I was initially going to add a lot proposing a different take on this problem, but I've decided to post an abridged version. I encourage other posters to see whether they believe this idea has any merit. The analysis is tough, but it might just be crazy enough to work.

We can use the fact that we have a fixed k, and that sums of odd and even numbers behave in certain ways, to define a recursive algorithm to solve this problem.

First, modify the problem so that you have both even and odd numbers in the list (this can be accomplished by dividing by two if all are even, or by subtracting 1 from numbers and k from the target sum if all are odd, and repeating as necessary).

Next, use the fact that even target sums can be reached only by using an even number of odd numbers, and odd target sums can be reached using only an odd number of odd numbers. Generate appropriate subsets of the odd numbers, and call the algorithm recursively using the even numbers, the sum minus the sum of the subset of odd numbers being examined, and k minus the size of the subset of odd numbers. When k = 1, do binary search. If ever k > n (not sure this can happen), return false.

If you have very few odd numbers, this could allow you to very quickly pick up terms that must be part of a winning subset, or discard ones that cannot. You can transform problems with lots of even numbers to equivalent problems with lots of odd numbers by using the subtraction trick. The worst case must therefore be when the numbers of even and odd numbers are very similar... and that's where I am right now. A uselessly loose upper bound on this is many orders of magnitudes worse than brute-force, but I feel like this is probably at least as good as brute-force. Thoughts are welcome!

EDIT2: An example of the above, for illustration.

{1, 2, 2, 6, 7, 7, 20}, k = 3, sum = 20.
Subset {}:
 {2, 2, 6, 20}, k = 3, sum = 20
 = {1, 1, 3, 10}, k = 3, sum = 10
 Subset {}:
  {10}, k = 3, sum = 10
  Failure
 Subset {1, 1}:
  {10}, k = 1, sum = 8
  Failure
 Subset {1, 3}:
  {10}, k = 1, sum = 6
  Failure
Subset {1, 7}:
 {2, 2, 6, 20}, k = 1, sum = 12
 Failure
Subset {7, 7}:
 {2, 2, 6, 20}, k = 1, sum = 6
 Success
青萝楚歌 2025-01-03 15:11:24

时间复杂度仅为 O(n^k)n 元素中 k 大小的子集的数量)。

由于 k 是给定的常数,因此(可能是相当高阶的)多项式上限将复杂性作为 n 的函数。

The time complexity is trivially O(n^k) (number of k-sized subsets from n elements).

Since k is a given constant, a (possibly quite high-order) polynomial upper bounds the complexity as a function of n.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文