当前位置：文江博客话题详情

缺失号码面试问题 Redux

发布于 2024-10-07 09:44:06 字数 1032 浏览 6 评论 0原文

确定 1 到 N 范围内的缺失值的常见面试问题已经被重复了一千次。变化包括 2 个缺失值，最多 K 个缺失值。

示例问题：范围 [1,10] (1 2 4 5 7 8 9 10) = {3,6}

以下是各种解决方案的示例：

简单的面试问题变得更难：给定数字 1..100，找到缺失的数字

我的问题是，将一个缺失值的简单情况视为 O(n) 复杂度，并且较大情况的复杂度收敛于大约大于 O(nlogn) 的值：

这不是更容易回答吗通过对范围进行排序（合并排序）并对其进行迭代来观察丢失的元素来解决问题？

此解决方案的时间不应超过 O(nlogn)，并且能够解决 1 到 N 以外的范围的问题，例如 10 到 1000 或 -100 到 +100 等。对于大量缺失值，

是否有理由相信上述 SO 链接中给出的解决方案会比基于排序的解决方案更好？

注意：这个问题似乎有很多常见的解决方案，都假设仅采用数论方法。如果在 S/E 面试中被问到这样的问题，假设该方法与数论解决方案的复杂性相当，那么使用更计算机科学/算法的方法不是明智的选择吗

？相关链接：

原文

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

静若繁花 2024-10-14 09:44:07

我已经回答了此处

您还可以创建一个大小为last_element_in_the_existing_array + 1的布尔数组。

在 for 循环中，标记现有数组中存在的所有元素 true。

在另一个 for 循环中打印包含 false 的元素的索引，也称为缺失元素。

时间复杂度：O(last_element_in_the_existing_array)

空间复杂度：O(array.length)

回复收藏 0 原文

梦巷 2024-10-14 09:44:07

如果范围提前给你，在这种情况下范围是[1,10]，你可以用你的范围和给你的数字执行异或运算。由于 XOR 是可交换运算。您将剩下 {3,6}

(1 2 3 4 5 6 7 8 9 10) XOR (1 2 4 5 7 8 9 10) ={3,6}

回复收藏 0 原文

毁我热情 2024-10-14 09:44:06

您仅指定时间复杂度，但空间复杂度也很重要。

问题的复杂性可以用 N（范围的长度）和 K（缺失元素的数量）来指定。

在您链接的问题中，使用方程的解在空间中是 O(K) （或者可能更多一点？），因为每个未知值需要一个方程。

还有一个保存点：您可以更改已知元素列表吗？在许多情况下，这是不希望的，在这种情况下，任何涉及对元素重新排序或消耗它们的解决方案都必须首先在空间中制作一个副本，O(NK)。

我无法比线性解决方案更快地看到：您需要读取所有已知元素（NK）并输出所有未知元素（K）。因此，你无法及时获得比 O(N) 更好的结果。

让我们分解解

销毁，O(N) 空间，O(N log N) 时间：就地排序
保存，O(K) 空间？，O(N log N) 时间：方程组
保存，O(N ) 空间，O(N) 时间：计数排序

就个人而言，虽然我发现方程系统解决方案很聪明，但我可能会使用其中任何一种排序解决方案。让我们面对现实吧：它们的编码要简单得多，尤其是计数排序！

就时间而言，在实际执行中，我认为“计数排序”将轻松击败所有其他解决方案。

注意：计数排序不要求范围为[0, X)，任何范围都可以，因为任何有限范围都可以转置为[ 0, X) 通过简单翻译形成。

编辑：

将排序更改为 O(N)，需要拥有所有可用于对它们进行排序的元素。

经过一些时间思考这个问题后，我还提出了另一个解决方案。如前所述，当 N （急剧）增长时，所需的空间可能会爆炸。但是，如果 K 很小，那么我们可以使用间隔来更改列表的表示：

{4, 5, 3, 1, 7}

可以表示为

[1,1] U [3,5] U [7,7]

在一般情况下，维护间隔的排序列表比维护元素的排序列表成本要低得多，而且推断缺失的数字也很容易。

时间复杂度很简单：O(N log N)，毕竟它基本上是一种插入排序。

当然，真正有趣的是，不需要实际存储列表，因此您可以将其通过流提供给算法。

另一方面，我很难计算出平均空间复杂度。 “最终”占用的空间是 O(K)（最多 K+1 个间隔），但在构造过程中，由于我们没有按特定顺序引入元素，因此会出现更多缺失的间隔。

最坏的情况很简单：N/2 个间隔（考虑奇数与偶数）。但我无法弄清楚平均情况。我的直觉告诉我它应该比 O(N) 更好，但我不太相信。

You are only specifying the time complexity, but the space complexity is also important to consider.

The problem complexity can be specified in term of N (the length of the range) and K (the number of missing elements).

In the question you link, the solution of using equations is O(K) in space (or perhaps a bit more ?), as you need one equation per unknown value.

There is also the preservation point: may you alter the list of known elements ? In a number of cases this is undesirable, in which case any solution involving reordering the elements, or consuming them, must first make a copy, O(N-K) in space.

I cannot see faster than a linear solution: you need to read all known elements (N-K) and output all unknown elements (K). Therefore you cannot get better than O(N) in time.

Let us break down the solutions

Destroying, O(N) space, O(N log N) time: in-place sort
Preserving, O(K) space ?, O(N log N) time: equation system
Preserving, O(N) space, O(N) time: counting sort

Personally, though I find the equation system solution clever, I would probably use either of the sorting solutions. Let's face it: they are much simpler to code, especially the counting sort one!

And as far as time goes, in a real execution, I think the "counting sort" would beat all other solutions hands down.

Note: the counting sort does not require the range to be [0, X), any range will do, as any finite range can be transposed to the [0, X) form by a simple translation.

EDIT:

Changed the sort to O(N), one needs to have all the elements available to sort them.

Having had some time to think about the problem, I also have another solution to propose. As noted, when N grows (dramatically) the space required might explode. However, if K is small, then we could change our representation of the list, using intervals:

{4, 5, 3, 1, 7}

can be represented as

[1,1] U [3,5] U [7,7]

In the average case, maintaining a sorted list of intervals is much less costly than maintaining a sorted list of elements, and it's as easy to deduce the missing numbers too.

The time complexity is easy: O(N log N), after all it's basically an insertion sort.

Of course what's really interesting is that there is no need to actually store the list, thus you can feed it with a stream to the algorithm.

On the other hand, I have quite a hard time figuring out the average space complexity. The "final" space occupied is O(K) (at most K+1 intervals), but during the construction there will be much more missing intervals as we introduce the elements in no particular order.

The worst case is easy enough: N/2 intervals (think odd vs even numbers). I cannot however figure out the average case though. My gut feeling is telling me it should be better than O(N), but I am not that trusting.

回复收藏 0 原文

偏爱自由 2024-10-14 09:44:06

理论上给定的解决方案是否优于排序方案取决于 N 和 K。虽然您的解决方案的复杂度为 O(N*log(N))，但给定的解决方案为 O(N *K）。我认为给定的解决方案（与排序解决方案相同）能够通过将范围 [A, B] 转换为 [A, B] 来解决任何范围 [A, B]代码>[1，N]。