First pair of numbers in a stream that adds up to a specific value
There is a stream of integers coming through. The problem is to find the first pair of numbers from the stream that adds up to a specific value (say, k).
With static arrays, one can use either of the approaches below:
- Approach (1): Sort the array, place two pointers at the beginning and end of the array, and compare their sum with k, moving the pointers inward.
- Approach (2): Use hashing, i.e. if A[i] + A[j] = k, then A[j] = k - A[i]. Search for A[j] in the hash table.
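For reference, the two static-array approaches can be sketched roughly as follows. This is only an illustration: the function names `pair_by_sorting` and `pair_by_hashing` are made up here, and the sketch assumes the whole array fits in memory as a Python list.

```python
def pair_by_sorting(values, k):
    """Approach (1): sort a copy, then walk two pointers inward."""
    ordered = sorted(values)
    lo, hi = 0, len(ordered) - 1
    while lo < hi:
        total = ordered[lo] + ordered[hi]
        if total == k:
            return ordered[lo], ordered[hi]
        if total < k:
            lo += 1   # sum too small: move the left pointer right
        else:
            hi -= 1   # sum too large: move the right pointer left
    return None


def pair_by_hashing(values, k):
    """Approach (2): for each element, look up k - element among earlier ones."""
    seen = set()
    for value in values:
        if k - value in seen:
            return k - value, value
        seen.add(value)
    return None
```

Note that the sorting version returns some pair that sums to k, not necessarily the one that appears first in the original order, because sorting discards the arrival order.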
But neither of these approaches scales well for streams. Any thoughts on solving this efficiently?
Comments (1)
I believe that there is no way to do this without using at least linear memory, i.e. Ω(n), where n is the number of elements that appear before the first pair that sums to k. I'm assuming that we are using a RAM machine, but not a machine that permits awful bitwise hackery (in other words, we can't do anything fancy with bit packing).
The proof sketch is as follows. Suppose that we don't store all of the n elements that appear before the first pair that sums to k. Then when we see the nth element, which sums with some previous value to get k, there is a chance that we will have discarded the previous element that it pairs with and thus won't know that the sum of k has been reached. More formally, suppose that an adversary could watch what values we were storing in memory as we looked at the first n - 1 elements and noted that we didn't store some element x. Then the adversary could set the next element of the stream to be k - x and we would incorrectly report that the sum had not yet been reached, since we wouldn't remember seeing x.
Given that we need to store all the elements we've seen, and without knowing more about the numbers in the stream, a very good approach would be to use a hash table that contains all of the elements we've seen so far. Given a good hash table, this would take O(n) memory and expected O(n) time in total, since each lookup and insertion takes expected O(1) time.
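A minimal sketch of that hash-table approach, assuming the stream is any Python iterable of integers and that "first pair" means the pair completed by the earliest-arriving element. The function name `first_pair_with_sum` is just a placeholder for illustration.

```python
from typing import Iterable, Optional, Tuple


def first_pair_with_sum(stream: Iterable[int], k: int) -> Optional[Tuple[int, int]]:
    """Return (earlier, later) for the first pair in the stream summing to k.

    Returns None if the stream ends before any such pair appears.
    """
    seen = set()  # every distinct value observed so far
    for value in stream:
        complement = k - value
        if complement in seen:
            # value completes a pair with an element seen earlier
            return complement, value
        seen.add(value)
    return None
```

For example, `first_pair_with_sum(iter([4, 7, 1, 9, 3, 6]), 10)` returns `(1, 9)`, and nothing after the 9 is read from the stream. Memory grows with the number of distinct values seen before the pair completes, which matches the linear bound argued above.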
I am not sure whether there is a more clever strategy for solving this problem if you make stronger assumptions about the sorts of numbers in the stream, but I am fairly confident that this is asymptotically ideal in terms of time and space.
Hope this helps!