In an array of integers, one value occurs twice. How do you determine which one?

Posted on 2024-12-05 14:53:02

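Suppose the array contains integers between 1 and 1,000,000.

I know some popular ways to solve this problem:

If all the numbers between 1 and 1,000,000 are present, compute the sum of the array elements and subtract the expected total, n(n+1)/2, from it
Use a hash map (requires extra memory)
Use a bitmap (less memory overhead)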
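
For reference, the sum and bitmap methods from the list above could be sketched in C roughly as follows. The function names, the parameter layout, and the convention of returning 0 when nothing repeats are assumptions made for this illustration, not part of the question. The sum version only works when every value in 1..n is present; the bitmap version does not need that.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Sum method (hypothetical helper). Assumes a[] has n + 1 entries:
 * every value in 1..n once, plus one repeated value. */
int find_dup_by_sum(const int *a, size_t n)
{
    /* 1 + 2 + ... + n, in 64 bits: for n = 1,000,000 this is about 5e11 */
    int64_t expected = (int64_t)n * ((int64_t)n + 1) / 2;
    int64_t actual = 0;
    for (size_t i = 0; i < n + 1; i++)
        actual += a[i];
    return (int)(actual - expected);
}

/* Bitmap method (hypothetical helper). a[] has len entries, each in
 * 1..max_value. Returns the first repeated value found, 0 if no value
 * repeats, or -1 if the bitmap cannot be allocated. */
int find_dup_by_bitmap(const int *a, size_t len, int max_value)
{
    assert(max_value >= 1);
    size_t bytes = (size_t)max_value / 8 + 1;   /* one bit per possible value */
    unsigned char *seen = calloc(bytes, 1);
    if (seen == NULL)
        return -1;

    int dup = 0;
    for (size_t i = 0; i < len; i++) {
        int v = a[i];
        assert(v >= 1 && v <= max_value);
        unsigned char mask = (unsigned char)(1u << (v % 8));
        if (seen[v / 8] & mask) {               /* bit already set: seen before */
            dup = v;
            break;
        }
        seen[v / 8] |= mask;
    }
    free(seen);
    return dup;
}
```

With max_value = 1,000,000 the bitmap takes roughly 125 KB, which is where the "less memory overhead" compared with a hash map comes from.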

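I recently came across another solution, and I need some help understanding the logic behind it:

Keep a single radix accumulator. You exclusive-or the accumulator with both the index and the value at that index.
The fact that x ^ C ^ x == C is useful here, since each number will be XORed twice, except the one that is in the array twice, which will appear 3 times (x ^ x ^ x == x), and the final index, which will appear once. So if we seed the accumulator with the final index, the accumulator's final value will be the number that is in the list twice.

I would appreciate it if someone could help me understand the logic behind this approach (with a small example!).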

花海 2024-12-12 14:53:02

Assume you have an accumulator

int accumulator = 0;

At each step of your loop, you XOR the accumulator with i and v, where i is the index of the loop iteration and v is the value in the ith position of the array.

accumulator ^= (i ^ v);

If the array is sorted and has no duplicate, i and v are the same number at every step, so you end up doing

accumulator ^= (i ^ i);

But i ^ i == 0, so each step is a no-op and the accumulator keeps its value. The order of the numbers in the array does not matter, because XOR is commutative and associative: even if the array is shuffled to begin with, the result at the end is still 0 (the initial value of the accumulator).
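
To make the no-duplicate case concrete, here is a minimal C sketch. It assumes the array is a permutation of 0..n-1, so that indices and values cover the same range; the function name is made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* XORs every index i and every value a[i] into one accumulator.
 * Assumption: a[] is a permutation of 0..n-1 (no duplicates). */
int xor_indices_and_values(const int *a, size_t n)
{
    int accumulator = 0;
    for (size_t i = 0; i < n; i++) {
        assert(a[i] >= 0);
        accumulator ^= ((int)i ^ a[i]);
    }
    return accumulator;
}
```

For the shuffled array {2, 0, 3, 1} the indices contribute 0 ^ 1 ^ 2 ^ 3 and the values contribute 2 ^ 0 ^ 3 ^ 1. Every number shows up exactly twice overall, so the accumulator ends at 0, the same as for the sorted array {0, 1, 2, 3}.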

Now what if a number occurs twice in the array? Obviously, this number will appear three times in the XORing (one for the index equal to the number, one for the normal appearance of the number, and one for the extra appearance). Furthermore, one of the other numbers will only appear once (only for its index).

The solution then assumes that the number appearing only once is the last index of the array, N. In other words, the values in the array form a contiguous range that starts at the first index processed (edit: thanks to caf for the heads-up in the comments; this is what I had in mind, but I messed it up when writing). Given that N appears only once, consider that starting with

int accumulator = N;

effectively makes N appear twice in the XORing as well. At this point, every number appears exactly twice except the one number that appears three times. Since the twice-appearing numbers XOR out to 0, the final value of the accumulator equals the number that appears three times (i.e. the one with an extra occurrence).
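
Putting the pieces together, a minimal C sketch of the whole approach might look like this. It assumes 0-based indices and values 0..N-1 (N being the last index), with exactly one value stored twice; the function name is invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: a[] has len >= 2 entries, with last index N = len - 1.
 * The values are 0..N-1, each present once, except one value that is
 * present twice. Returns that repeated value. */
int find_duplicate_xor(const int *a, size_t len)
{
    assert(len >= 2);
    int accumulator = (int)(len - 1);       /* seed with the last index, N */
    for (size_t i = 0; i < len; i++)
        accumulator ^= ((int)i ^ a[i]);     /* fold in index and value */
    return accumulator;
}
```

As a small worked example, take {0, 2, 1, 2}, so N = 3 and the value 2 is stored twice. The accumulator starts at 3 and then becomes 3 ^ (0 ^ 0) = 3, 3 ^ (1 ^ 2) = 0, 0 ^ (2 ^ 1) = 3 and finally 3 ^ (3 ^ 2) = 2, which is the duplicate.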

苏辞 2024-12-12 14:53:02

Each number between 1 and 10,001 inclusive appears as an array index. (Aren't C arrays 0-indexed? Well, it doesn't make a difference provided we're consistent about whether the array values and indices both start at 0 or both start at 1. I'll go with the array starting at 1, since that's what the question seems to say.)

Anyway, yes, each number between 1 and 10,001 inclusive appears, precisely once, as an array index. Each number between 1 and 10,000 inclusive also appears as an array value precisely once, with the exception of the duplicated value which occurs twice. So mathematically, the calculation we're doing overall is the following:

1 xor 1 xor 2 xor 2 xor 3 xor 3 xor ... xor 10,000 xor 10,000 xor 10,001 xor D

where D is the duplicated value. Of course, the terms in the calculation probably don't appear in that order, but xor is commutative and associative, so we can rearrange the terms however we like. And n xor n is 0 for each n. So the above simplifies to