Why is the knapsack problem pseudo-polynomial?


I know that Knapsack is NP-complete, yet it can be solved by DP. They say that the DP solution is pseudo-polynomial, since it is exponential in the "length of the input" (i.e. the number of bits required to encode the input). Unfortunately I did not get it. Can anybody explain that pseudo-polynomial thing to me slowly?


Comments (6)

指尖上得阳光 2024-10-16 10:59:28

The running time is O(NW) for the unbounded knapsack problem with N items and a knapsack of capacity W. W is not polynomial in the length of the input, though, which is what makes the algorithm pseudo-polynomial.

Consider W = 1,000,000,000,000. It only takes 40 bits to represent this number, so the input size is about 40 bits, but the running time contains the factor 1,000,000,000,000, which is roughly 2^40.

So the runtime is more accurately written as O(N · 2^(bits in W)), which is exponential in the length of the input.
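
For concreteness, here is a minimal C++ sketch of the O(NW) dynamic program this answer describes, in its unbounded form (each item may be used any number of times). The function name and types are illustrative, not from the original post; the point is simply that the capacity loop runs W times, no matter how few bits W needs.

#include <algorithm>
#include <vector>

// Illustrative sketch: unbounded knapsack, O(N * W) time and O(W) space.
// values[i] / weights[i] describe item i; W is the knapsack capacity.
long long UnboundedKnapsack(const std::vector<long long>& values,
                            const std::vector<long long>& weights,
                            long long W) {
    std::vector<long long> best(W + 1, 0);   // best[c] = max value achievable with capacity c
    for (long long c = 1; c <= W; ++c) {     // this loop alone is W iterations
        for (std::size_t i = 0; i < values.size(); ++i) {
            if (weights[i] <= c) {
                best[c] = std::max(best[c], best[c - weights[i]] + values[i]);
            }
        }
    }
    return best[W];                          // W + 1 table entries were filled
}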

樱花落人离去 2024-10-16 10:59:28

In most of our problems we are dealing with large lists of numbers which fit comfortably inside standard int/float data types. Because most processors are built to handle 4-8 byte numbers at a time at no additional cost (relative to numbers that fit in, say, 1 byte), we rarely see the running time change when we scale our numbers up or down within the ranges we meet in real problems - so the dominant factor remains just the sheer quantity of data points, the n or m factors that we are used to.

(You can imagine that the Big-O notation is hiding a constant factor that divides out 32 or 64 bits per datum, leaving only the number of data points, as long as each of our numbers fits in that many bits or fewer.)

But try reworking other algorithms to act on data sets involving big integers - numbers that need more than 8 bytes to represent - and see what that does to the runtime. The magnitude of the numbers involved always makes a difference, even in other algorithms such as binary sort, once you move beyond the safety buffer that conventional processors give us "for free" by handling 4-8 byte batches.

The trick with the Knapsack algorithm we discussed is that it is unusually sensitive (relative to other algorithms) to the magnitude of one particular parameter, W. Add one bit to W and you double the running time of the algorithm. We have not seen that kind of dramatic response to a change in value in the other algorithms, which is why it might seem like we are treating Knapsack differently - but that is a genuine analysis of how it responds, in a non-polynomial fashion, to changes in input size.
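
To make that last point concrete, here is a small demo (my own illustration, not from the answer) that prints the N·W work factor of the DP as W gains one bit at a time; each extra bit doubles the count even though the encoded input grows by only one bit.

#include <cstdint>
#include <cstdio>

int main() {
    const std::uint64_t N = 100;                  // number of items, arbitrary for the demo
    for (int bits = 20; bits <= 25; ++bits) {
        std::uint64_t W = (1ULL << bits) - 1;     // largest capacity that fits in `bits` bits
        // The DP fills roughly N * W table cells: one extra bit in W doubles this count.
        std::printf("bits = %2d  W = %10llu  N*W = %12llu\n",
                    bits,
                    static_cast<unsigned long long>(W),
                    static_cast<unsigned long long>(N * W));
    }
    return 0;
}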

冬天的雪花 2024-10-16 10:59:28

The way I understand this: the complexity would have been O(W) in the input size if the capacity input were an array [1, 2, ..., W], which has size W. But the capacity input is not an array of numbers; it is a single integer. Time complexity is about the relationship to the size of the input, and the size of an integer is NOT its value, but the number of bits needed to represent it. We do later expand this integer W into the range [1, 2, ..., W] inside the algorithm, which leads people to mistakenly think W is the size - but that range is not the input; the integer itself is.

Think of the input as "an array of stuff" and the size as "how many things are in the array". The item input really is an array of n items, so its size is n. The capacity input is NOT an array with W numbers in it, but a single integer, represented by about log(W) bits. Increase its size by 1 (add one meaningful bit) and W doubles, so the running time doubles - hence the exponential time complexity.
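
As a small illustration of "size is bits, not value", here is a sketch (the helper name is mine) that prints the encoded size of a capacity next to its value:

#include <cstdint>
#include <cstdio>

// Number of bits needed to write w in binary - the "input size" of the capacity.
int BitLength(std::uint64_t w) {
    int bits = 0;
    while (w > 0) {
        ++bits;
        w >>= 1;
    }
    return bits;
}

int main() {
    for (std::uint64_t w = 2; w <= (1ULL << 20); w *= 4) {
        std::printf("value W = %10llu   encoded size = %2d bits\n",
                    static_cast<unsigned long long>(w), BitLength(w));
    }
    return 0;
}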

橘香 2024-10-16 10:59:28

The Knapsack algorithm's run-time is bounded not only by the size of the input (n, the number of items) but also by the magnitude of the input (W, the knapsack capacity): it is O(nW), which is exponential in the number of bits used to represent W in binary. Computational complexity (i.e. how processing is done inside a computer, through bits) is concerned only with the size of the inputs, not with their magnitudes/values.

Disregard the value/weight lists for a moment. Let's say we have an instance with knapsack capacity 2. W would take two bits in the input data. Now we increase the knapsack capacity to 4, keeping the rest of the input the same. Our input has grown by only one bit, but the computational work has doubled. If we increase the capacity to 1024, the input for W is just 10 bits instead of 2, but the work has increased by a factor of 512. Time complexity grows exponentially in the size of W's binary (or decimal) representation.

Another simple example that helped me understand the pseudo-polynomial concept is the naive primality testing algorithm. For a given number n we check whether it is divided evenly by each integer in the range 2..√n, so the algorithm takes about √n − 1 steps. But here n is the magnitude of the input, not its size.
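
A minimal sketch of that naive test (the function name is mine): the loop runs about √n times, i.e. roughly 2^(bits(n)/2) times, so the work grows with the value of n even though the encoded input is only about log2(n) bits.

#include <cstdint>

// Naive primality test: trial division by every integer in 2..sqrt(n).
// Takes about sqrt(n) steps, i.e. roughly 2^(bits(n)/2) - pseudo-polynomial.
bool IsPrime(std::uint64_t n) {
    if (n < 2) return false;
    for (std::uint64_t d = 2; d <= n / d; ++d) {  // d <= n/d avoids overflow of d*d
        if (n % d == 0) return false;
    }
    return true;
}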

Now, the regular O(n) case:

By contrast, searching an array for a given element runs in polynomial time, O(n). It takes at most n steps, and here n is the size of the input (the length of the array).

See also: Calculating bits required to store decimal number

黑色毁心梦 2024-10-16 10:59:28

Complexity is based on the input. In the knapsack problem the inputs are: the number of items (size), the maximum capacity W, and the profit and weight arrays. Because we construct a DP table of size × W cells, it feels like a polynomial-time algorithm. But the input W is an integer, not an array, so in terms of the number of bits needed to store the given W the running time is O(size · 2^bits). If the number of bits increases by 1, the running time doubles. Thus it is exponential in the input length, and therefore pseudo-polynomial.
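
For reference, here is a sketch of the size × W table this answer describes, for the 0/1 variant (function and parameter names are mine, not from the post):

#include <algorithm>
#include <vector>

// 0/1 knapsack: the dp table has (n + 1) * (W + 1) cells, so filling it costs O(n * W).
int Knapsack01(const std::vector<int>& profit,
               const std::vector<int>& weight,
               int W) {
    const int n = static_cast<int>(profit.size());
    std::vector<std::vector<int>> dp(n + 1, std::vector<int>(W + 1, 0));
    for (int i = 1; i <= n; ++i) {
        for (int c = 0; c <= W; ++c) {
            dp[i][c] = dp[i - 1][c];                       // skip item i
            if (weight[i - 1] <= c) {                      // or take item i
                dp[i][c] = std::max(dp[i][c],
                                    dp[i - 1][c - weight[i - 1]] + profit[i - 1]);
            }
        }
    }
    return dp[n][W];
}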

も星光 2024-10-16 10:59:28

Time complexity is defined as a function of the input size (not of the input values).

Knapsack has complexity O(nW); here n is the size of the coins array. To represent W we need log2(W) bits, so the bound can be rewritten as O(n · 2^(log2 W)) = O(n · 2^(input size of W)), hence exponential.

By the same logic, the O(nW) coin change DP is also exponential. However, a greedy algorithm with canonical coin denominations is polynomial:

#include <vector>

// Greedy coin change for a canonical coin system.
// Assumes the coins are sorted in ascending order; returns the number of coins used.
int CoinChange(const std::vector<int>& coins, int amount) {
    int coinsUsed = 0;
    for (int i = static_cast<int>(coins.size()) - 1; i >= 0; --i) {
        coinsUsed += amount / coins[i];   // take as many of the current largest coin as possible
        amount %= coins[i];
    }
    return coinsUsed;
}
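
For contrast, here is a sketch of the O(n · amount) coin change DP that the answer calls exponential in the input length (minimum-coin version; the function name is mine): its table has amount + 1 entries, so the work scales with the value of amount, not with the roughly log2(amount) bits needed to encode it.

#include <algorithm>
#include <climits>
#include <vector>

// DP coin change: O(n * amount) time, amount + 1 table entries - pseudo-polynomial.
// Returns the minimum number of coins summing to `amount`, or -1 if impossible.
int CoinChangeDP(const std::vector<int>& coins, int amount) {
    const int INF = INT_MAX / 2;
    std::vector<int> dp(amount + 1, INF);
    dp[0] = 0;
    for (int a = 1; a <= amount; ++a) {            // `amount` iterations, i.e. value-sized work
        for (int coin : coins) {
            if (coin <= a) {
                dp[a] = std::min(dp[a], dp[a - coin] + 1);
            }
        }
    }
    return dp[amount] >= INF ? -1 : dp[amount];
}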