Determine whether A + B = C exists in an array of n integers
This is a problem a friend of mine received as homework (in an algorithms and data structures class). He asked me about it, but I can't solve it and have been thinking about it on and off for the last several days.
There are n random integers in the range [0, 2^31 - 1] (there may be duplicates). Determine whether any 3 of these numbers satisfy A + B = C.
I first came up with a naive algorithm that's O(n^2 log n).
I then came up with an algorithm that's O(n^2). Here is the pseudocode:
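sort(a); // non-descending
for (i = 0; i < n; i++) {
  // a[i] is A, a[j] is B, a[k] is C; values are non-negative, so i < j < k suffices
  j = i + 1; k = i + 2;
  while (k < n) {
    if (a[i] + a[j] == a[k])
      return true;
    else if (a[i] + a[j] > a[k])
      k++;            // candidate C is too small
    else {
      j++;            // candidate B is too small
      if (j == k)
        k++;          // keep C strictly after B
    }
  }
}
return false;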
However, the problem states that 1 < n <= 10^6, so I believe O(n^2) is too slow. My solution doesn't make use of the randomness, and I'm not sure whether that is an important part of the problem.
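For reference, here is a runnable Python sketch of the sort-then-scan idea. It assumes all values are non-negative (as the stated range implies) and that A, B and C must come from three different positions; the function name `has_sum_triple_sorted` is just an illustrative choice.

```python
def has_sum_triple_sorted(nums):
    """Return True if a[i] + a[j] == a[k] for some positions i < j < k.

    Assumes non-negative integers, so C can always be taken after A and B
    in sorted order.
    """
    a = sorted(nums)  # non-descending
    n = len(a)
    for i in range(n - 2):
        j, k = i + 1, i + 2
        while k < n:
            s = a[i] + a[j]
            if s == a[k]:
                return True
            if s > a[k]:
                k += 1          # candidate C is too small
            else:
                j += 1          # candidate B is too small
                if j == k:
                    k += 1      # keep C strictly after B
    return False
```

The worst case is still quadratic: each outer iteration moves `j` and `k` forward at most n times in total.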
Comments (5)
The general problem is 3SUM-hard, and whether a better-than-quadratic algorithm exists is an open question.
So if you need a faster algorithm, you would probably need to exploit the fact that the numbers are 32-bit.
If the numbers are random, any worst-case O(n^2) algorithm (including yours) will run very fast. In fact, the practical complexity will be O(n*logn) (the complexity of sorting). It's much like quicksort, where we have O(n*logn) on average and a tiny chance of hitting O(n^2).
10^6 random numbers give us ~10^6*10^6 'nearly random' sums in the range ~0..10^9. What's the chance that one of those 10^12 random sums will be equal to a given random value in the integer range? Pretty good. Now, what's the chance that one of those 10^12 random sums will be equal to one of 10^6 given random values? 100%, speaking poetically.
I've implemented your proposed solution; for n = 10^6 it performs on average 5000-10000 operations in the innermost loop. So much for O(n^2). Sorting is the costliest operation in there.
PS. You may reduce the complexity further and make it even O(1), if you update the solution to use a hash instead of sorting.
PS 2. The test program in Java, for reference. Run it and see for yourself.
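As a back-of-the-envelope check of the probability argument, the following sketch estimates how many (A, B) pairs one would expect to inspect before some C matches. It assumes A, B and C are independent and uniform over the range, treats the inspected pairs as if they were random (the real scan over a sorted array is not), and ignores duplicates. The function name is hypothetical.

```python
def expected_pairs_before_hit(n: int, value_range: int = 2**31) -> float:
    """Rough expected number of (A, B) pairs inspected before some C matches."""
    # For independent uniform A, B, C in [0, value_range), P(A + B == C) is
    # about 1 / (2 * value_range): roughly half of all sums exceed the range.
    p_single = 1.0 / (2 * value_range)
    # Approximate chance that a given sum equals at least one of the n values.
    p_pair = min(1.0, n * p_single)
    return 1.0 / p_pair


print(round(expected_pairs_before_hit(10**6)))  # prints 4295
```

Under these assumptions the estimate is a few thousand pairs for n = 10^6, the same order of magnitude as the 5000-10000 inner-loop operations reported above.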
An algorithm that uses hashing takes 10-900 microseconds in Python (average: 200, median: 60).
It is O(N**2), but it seems to be fast enough. For comparison, the amortized O(N) operation of creating the frozenset takes 270 milliseconds (1000 times slower than the search), and creating the random list takes 0.9 seconds.
Note: random.sample doesn't return repeated elements if the input sequence contains unique elements, so frozenset doesn't discard any elements in the above example. To solve the problem for a random sequence that allows repeated elements, we should use two data structures.
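A minimal sketch of a hash-based search that tolerates repeated elements is shown below. It is not the answer's original code; the choice of a `Counter` (value to multiplicity) plus a list of distinct values as the "two data structures" is an assumption, as is the requirement that A, B and C occupy three different positions.

```python
from collections import Counter


def has_sum_triple(nums):
    """Return True if three distinct positions hold A, B, C with A + B == C."""
    counts = Counter(nums)   # value -> how many times it occurs
    values = list(counts)    # distinct values, used for pair enumeration
    for i, a in enumerate(values):
        for j in range(i, len(values)):
            b = values[j]
            c = a + b
            if c not in counts:          # O(1) expected hash lookup
                continue
            # Make sure the input holds enough copies of each value,
            # e.g. A == B needs two copies, A == B == C == 0 needs three.
            needed = Counter((a, b, c))
            if all(counts[v] >= m for v, m in needed.items()):
                return True
    return False
```

The worst case remains O(N**2) pair checks; on random input the loops are expected to exit long before that.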
I'm getting O(n log n) when measuring this over sorted lists, with about one call to bisect per element.
There are several optimizations that can considerably reduce the running time (like skipping over runs of numbers equal to the one already tested).
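The measured code is not shown here, so the following is only a sketch of what a bisect-based search over a sorted list might look like. It assumes non-negative values (so C can be searched for strictly after B) and uses a hypothetical function name.

```python
from bisect import bisect_left


def has_sum_triple_bisect(sorted_nums):
    """Binary-search for A + B among the elements after B in a sorted list."""
    n = len(sorted_nums)
    for i in range(n):
        for j in range(i + 1, n):
            target = sorted_nums[i] + sorted_nums[j]
            # Only look at positions after j, so C is a different element.
            k = bisect_left(sorted_nums, target, j + 1)
            if k < n and sorted_nums[k] == target:
                return True
    return False
```

The worst case is O(n^2 log n), but on random input the search tends to return after very few pairs, which would be consistent with roughly one `bisect` call per element.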
A+B=C, hence
B=C-A or A=C-B
For a fixed C, the above problem can be done in O(n) expected time by using a hash table.
Hope that helps.
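A small sketch of the hash-table lookup, assuming C is a single fixed target and A and B must come from different positions (the function name is illustrative):

```python
def has_pair_with_sum(nums, target):
    """Return True if two elements at different positions add up to target."""
    seen = set()
    for x in nums:
        if target - x in seen:   # B = C - A was seen earlier
            return True
        seen.add(x)
    return False
```

This runs in O(n) expected time for one target. Repeating it for every candidate C in the array brings the worst case back to O(n^2).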