子集和算法
我正在研究这个问题:
子集和问题将一组
X = {x1, x2 ,…, xn}
(由n
个整数组成)和另一个整数K
作为输入>。问题是检查X
是否存在其元素总和为K
的子集X'
,并查找该子集(如果有)。例如,如果X = {5, 3, 11, 8, 2}
且K = 16
则答案为YES
,因为子集X' = {5, 11}
的总和为16
。实现一个子集和算法,其运行时间至少为O(nK)
。
注意复杂度O(nK)
。我认为动态规划可能会有所帮助。
我找到了指数时间算法,但没有帮助。
有人可以帮我解决这个问题吗?
I am working on this problem:
The Subset Sum problem takes as input a set
X = {x1, x2 ,…, xn}
ofn
integers and another integerK
. The problem is to check if there exists a subsetX'
ofX
whose elements sum toK
and finds the subset if there's any. For example, ifX = {5, 3, 11, 8, 2}
andK = 16
then the answer isYES
since the subsetX' = {5, 11}
has a sum of16
. Implement an algorithm for Subset Sum whose run time is at leastO(nK)
.
Notice complexity O(nK)
. I think dynamic programming may help.
I have found an exponential time algorithm, but it doesn't help.
Can someone help me solve this problem?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(12)
看来我参加聚会迟到了,这是我的两分钱。我们将创建一个
boolean[] Solution[n+1][k+1]
,使得solution[i][j]
为true
如果使用前i
项(索引0
到i-1
),我们可以从集合中得到总和j
;否则假
。我们最终将返回solution[k][n]:我们可以推断出以下几点:
基于以上几点,我们可以轻松编写如下算法。
It seems I am late to the party, here are my two cents. We will create a
boolean[] solution[n+1][k+1]
such thatsolution[i][j]
istrue
if using firsti
items (index0
toi-1
) we can get sumj
from the set; elsefalse
. We will returnsolution[k][n]
finally:We can deduce the following points:
Based on the above points we can easily write the algorithm as below.
在一般情况下,没有已知的子集和算法运行时间小于 O(2^(n/2))。
There is no known algorithm for subset sum that runs in less than O(2^(n/2)), in the general case.
设 M 为所有元素的总和。
注意 K<=M
然后简单测试 m[k]
let M be the sum of all the elements.
Note that K<=M
Then simply test for m[k]
考虑第 i 个元素。它要么会对子集总和做出贡献,要么不会。如果它对总和有贡献,则“总和的值”会减少等于第 i 个元素的值。如果它没有贡献,那么我们需要在剩余元素中搜索“和的值”。
Consider i-th element. Either it will contribute for the subset sum or will not. if it contributes for the sum, then the "value of sum" is decreased by the value equal to the i-th element. If it does not contribute, then we need to search for the "value of sum" in the remaining elements.
上面的答案都很好,但并没有真正给出像这样的东西如何适用于正数和负数的最广泛概述。
给定一个有序的整数集,定义两个变量 X 和 Y,使得
X = 负元素之和
Y = 正元素之和
,并按此顺序应用这些规则来对初始集进行操作,就像在二叉树中递归一样
for return true
set,从排序数组中删除最右边的元素
数组 q,如果 X <= B <= Y 则返回 true,如果不返回 false
上面的答案更详细和准确,但对于非常广泛地了解这应该如何发挥作用,绘制一棵二叉树。这个长度对运行时间有何影响?
The above answers are all great but don't really give the broadest overview of how something like this could work for both positive and negative numbers.
Given an ordered set of integers, Define two variables X and Y such that
X = sum of negative elements
Y = sum of positive elements
and operate on your initial set as if you were recursing through a binary tree by applying these rules in this order
for return true
set, drop the rightmost element from your sorted array
array q, if X <= B <= Y then return true, if not return false
The above answers are more detailed and accurate but for a very broad view of how this should play out draw a binary tree. What does the length of this suggest about the runtime?
蛮力——忘记排序,尝试每一个组合,eval 解析器击败 Array.reduce(它也适用于负数)。
Brute force-- forget sorting, try every combo, and the eval parser beats Array.reduce (and it works with negative numbers too).
时间复杂度为 n^2 的递归解决方案
最坏情况复杂度: O(n^2)
最好情况: O(n) 即;如果第一个元素构成一个子集,其总和等于给定的总和。
如果我在这里计算时间复杂度是错误的,请纠正我。
Recursive Solution with n^2 time complexity
Worst Case Complexity: O(n^2)
Best case : O(n) i.e; if first element makes a subset whose sum is equal to given sum.
Correct me if i am wrong to calculate time complexity here.
一维数组的 DP 解决方案(DP 数组处理顺序在这里很重要)。
DP solution with one dimensional array(DP array processing order does matter here).
Subset Sum 是我在 Macalester 学到的第一个 NP 完全问题。这个问题被浏览了 36000 多次,但我没有看到足够的答案来详细解释算法的逻辑。所以我想我尝试这样做。
假设:
为了简单起见,我首先假设输入集
X
仅包含正整数,并且k
为正数。但是,我们可以调整算法来处理负整数以及 k 为负数的情况。逻辑:
该算法或任何 DP 问题的关键都是分解问题并简单地从基本情况开始。然后我们就可以使用我们已知的一些知识建立在基本情况的基础上:
X
包含k
,那么它的子集和为k
。x1
的子集是X
的子集,其总和为k1
,那么X
将有总和为k1
的子集,即x1
。X = {x1, x1, x3, ......., xn, xn+1}
。如果x1 = {x1, x1, x3, ......., xn}
的子集和为k1
,我们知道它的子集和为k1
>k - k1。举例说明1、2、3、4:
你不能有任何子集总和。
集合
X = {4}
的子集总和为 4,因为 4 本身是集合的一部分假设您有一个集合
x1 = {1,3,5}
,它是集合X = 的子集{1,3,5,2,8}
。如果x1
的子集和为k1 = 8
,则意味着X
的子集和为 8,因为x1
> 是X
的子集
X = {1,3,5,2,19}
,我们想知道它是否有一个子集和20. 确实如此,并且有一种方法可以知道x1 = {1,3,5,2}
之和是否可以等于 (20 - 19) = 1。因为 x1 的子集总和为 1然后当我们将 19 添加到集合 x1 中时我们可以用新的数字 1 + 19 = 20 来创建我们想要的总和 20。
动态构建矩阵
凉爽的!现在让我们利用上述四个逻辑并从基本案例开始构建。我们将构建一个矩阵
m
。我们定义:矩阵
m
有i+1
行和k + 1
列。矩阵的每个单元格都有值
true
或false
。m[i][s] 返回 true 或 false 来指示此问题的答案:“使用数组中的第一个
i
项,我们可以找到s< 的子集总和/code>? "
m[i][s]
表示是,返回true
,表示否(请注意维基百科的答案或大多数人构建一个函数 m(i, s)但我认为矩阵是理解动态编程的一种简单方法,当我们在集合或数组中只有正数时它效果很好,但是函数路径更好,因为您不必处理超出范围的索引。 ,匹配数组的索引并将总和与矩阵......)
让我们使用示例构建矩阵:
我们将逐行构建矩阵。我们最终想知道单元格 m[n][k] 包含
true
或false
。第一行:
逻辑 1. 告诉我们矩阵的第一行应该全部为
false
。第二行及以上:
然后对于第二行或以上,我们可以使用逻辑2,3,4来帮助我们填充矩阵。
如果其中任何一个为
true
,则m[i][s]
为true
,否则为false
。所以我们可以将 2,3,4 重写为 m[i][s] = (m[i-1][s] || a[i-1] == s || m[i-1] [s - a[i-1]])使用上述逻辑来填充矩阵
m
。在我们的示例中,它看起来像这样。现在使用矩阵来回答您的问题:
看看
m[5][9]
,这是原始问题。使用前 5 个项目(即所有项目)我们可以找到 9 (k) 的子集和吗?答案由该单元格表示,即true
这是代码:
要构建矩阵
m
需要 O((n+1)(k+1)),其中是O(nk)。看起来它应该是多项式,但事实并非如此!它实际上是伪多项式。请在此处阅读相关内容。同样,这只在输入仅包含正数时才有效。您可以轻松调整它以处理负数。该矩阵仍将有 n+1 行,但有
B - A + 1
列。其中B
是上限,A
是下限(+1 包括零)。矩阵仍然是 您必须偏移s
与下界。用文本从头到尾解释 DP 问题是相当困难的。但我希望这可以帮助那些试图理解这个问题的人。
请注意,在上面的示例中,DP 表的行已排序。情况并非一定如此。
这是问题情况的 DP 表,即给定一组 {5, 3, 11, 8, 2}。为了简洁起见,我省略了错误值。
下面是 JavaScript 中的一个实现,它将输出目标集 {5, 11}:
Subset Sum is the first NP-complete problem I learned at Macalester. This question is viewed 36000+ times yet I don't see a sufficient answer that explains the algorithm in detail with logic. So I thought I make an attempt to do so.
Assumption:
For the sake of simplicity first I made the assumption that the input set
X
contains only positive integers andk
is positive. However, we can tweak the algorithm to handle negative integers and the case ifk
is negative.Logic:
The key to this algorithm or really any DP problem is to break down the problem and start simply from a base case. then we can build on the base case using some knowledge that we know:
X
is empty then there is no way we can sum to any value ofk
.X
containsk
then it has a subset sum tok
.x1
who is a subset ofX
sum tok1
thenX
will have a subset that sum tok1
namelyx1
.X = {x1, x1, x3, ......., xn, xn+1}
. We know it has a subset sum tok1
ifx1 = {x1, x1, x3, ......., xn}
has a subset sum tok - k1
.Example to illustrate 1,2,3,4:
you can't have any subset sum.
A set
X = {4}
has a subset sum to 4 because 4 it self is part of the setsay you have a set
x1 = {1,3,5}
who is a subset of setX = {1,3,5,2,8}
. ifx1
has a subset sum tok1 = 8
then that meansX
also has a subset sum to 8 becausex1
is a subset ofX
X = {1,3,5,2,19}
and we want to know does it have a subset sum to 20. It does and one way to can know if that isx1 = {1,3,5,2}
can sum to (20 - 19) = 1. Since x1 has a subset sum to 1 then when we add 19 to the set x1we can take that new number 1 + 19 = 20 to create our desired sum 20.
Dynamically build a matrix
Cool! now let us utilize the above four logics and start building from the base case. We are going to build a matrix
m
. We define:matrix
m
hasi+1
rows andk + 1
columns.Each cell of the matrix has value
true
orfalse
.m[i][s] returns true or false to indicate the answer to this question: "using the first
i
items in the array can we find a subset sum tos
? "m[i][s]
returnstrue
for yes andfalse
for no(note the Wikipedia answer or most of the people build a function m(i,s) but I thought the matrix is an easy way to understand dynamic programming. It works well when we only have positive numbers in the set or array. However the function route is better because you don't have to deal with index out of range, match index of the array and sum to the matrix.....)
Let us build the matrix using an example:
We are going to build the matrix row by row. We ultimately want to know the cell m[n][k] contain
true
orfalse
.First Row:
Logic 1. told us that the first row of the matrix should all be
false
.Second Row and above:
Then for the second row or above, we can use logic 2,3,4 to help us populate the matrix.
m[i][s] = (X[i-1] == s)
rememebr m[i] is referring to the ith item in X which is X[i-1]m[i][s] = (m[i-1][s])
this is looking at the cell direclty above.m[i][s] = (m[i-1][s - X[i-1]])
this is looking at the row above and left of X[i-1] cells.If any of these is
true
thenm[i][s]
istrue
otherwisefalse
. so we can rewrite 2,3,4 intom[i][s] = (m[i-1][s] || a[i-1] == s || m[i-1][s - a[i-1]])
Use these above logics to populate the matrix
m
. In our example, it looks like this.Now use the matrix to answer your question:
look at
m[5][9]
which is the original question. using the first 5 items (which is all the items) can we find a subset sum to 9 (k)? and the answer is indicated by that cell which istrue
Here is the Code:
To build the matrix
m
takes O((n+1)(k+1)) which is O(nk). it seems like it should be polynomial but it is not! It's actually pseudo polynomial. Read about it hereAgain this only works if the input only contains positive numbers. You can easily tweak it to work with negative numbers tho. The matrix would still have n+1 rows but
B - A + 1
columns. WhereB
is the upperbound andA
is the lowerbound ( +1 to include zero).The matrix would still be You would have to offsets
with the lowerbound.It is pretty hard to explain DP problem over text from start to end. But I hope this helps those out there that try to understand this problem.
Note that in the examples above the rows of the DP table are sorted. That does not have to be the case.
Here is a DP table for the case of the question, i.e. given a set of {5, 3, 11, 8, 2}. For brevity, I have omitted the false values.
Below is an implementation in JavaScript which will output the target set {5, 11}:
由于看起来所有数字都是正数,因此您可以使用动态编程来解决此问题:
Start 将是一个大小为 K+1 的布尔数组
possible
,第一个值为 true,其余为 false。第 i 个值将表示是否可以实现 i 的子集和。对于集合中的每个数字 n,循环遍历 possible 数组,如果第 i 个值为 true,则将第 i+n 个值也设置为 true。最后,如果
possible
中的第k个值为true,那么就可以形成k的子集和。问题在 O(NK) 时间内解决。维基百科关于子集和问题的页面详细解释了应用于集合的算法不保证为正整数。
Since it looks like all your numbers are positive, you can solve this using dynamic programming:
Start will a boolean array
possible
of size K+1 with the first value true, the rest false. The ith value will represent whether a subset sum of i is possible to achieve. For each number n in your set, loop through thepossible
array and if the ith value is true, then set the i+nth value to true as well.At the end, if the kth value in
possible
is true, then you can form a subset sum of k. Problem solved in O(NK) time.Wikipedia's page on the subset sum problem has a detailed explanation of this algorithm applied to sets of integers not guaranteed to be positive.
我建议阅读 Wiki 的算法。该算法存在于其中,
O(P*n)
解参见伪多项式时间动态规划解,该解不是多项式时间,是 (p, n) 但它不是 n+log P(输入大小)中的多项式,并且因为P
可能非常大,如 2^n,所以解 P*n = (2^n)*n 不是一般而言,多项式时间解是多项式时间算法,但当 p 受 n 的某个多项式函数限制时,则为多项式时间算法。这个问题是NPC,但有一个
伪多项式时间
算法,属于弱 NP 完全性
问题,还有强NP完全
问题,这意味着,除非P=NP,否则你找不到任何伪多项式时间
算法,并且这个问题不存在这一系列的问题,所以不知何故很容易。我说得尽可能简单,但这并不是强 NP 完全问题或弱 NP 完全问题的精确定义。
有关详细信息,请参阅Garey 和 Johnson 第 4 章。
I'd suggest to read Wiki's algorithm. The algorithm exists there, see Pseudo-polynomial time dynamic programming solution for the
O(P*n)
solution, The solution is not polynomial time, is polynomial in (p,n) but it's not polynomial in n+log P (size of input) and becauseP
can be very large like 2^n, the solution P*n = (2^n)*n is not a polynomial time solution in general, but when p is bounded by some polynomial function of n is polynomial time algorithm.This problem is NPC, but there is a
Pseudo polynomial time
algorithm for it, and belongs toweakly NP-Complete
problems, Also there areStrongly NP-Complete
problems, which means, you can't find anypseudo polynomial time
algorithm for them unless P=NP, and this problem is not in this range of problems, So somehow is easy.I said this as simple as possible,but it's not an exact definition of a Strongly NP-Complete or Weakly NP-Complete problems.
For detail see Garey and Johnson chapter 4.