如何在 O(n) 时间内找到在排序数组中出现奇数次的数字?

发布于 2024-09-08 02:58:51 字数 205 浏览 6 评论 0 原文

我有一个问题,我试图一遍又一遍地思考......但什么也没得到,所以将问题发布在这里。也许我可以得到其他人的一些观点,尝试让它发挥作用......

问题是:我们得到一个排序数组,它由出现偶数次的值的集合组成,除了一个出现的次数奇数次。我们需要在 log n 时间内找到解决方案。

在 O(n) 时间内找到解决方案很容易,但在 log n 时间内执行看起来相当棘手。

I have a question and I tried to think over it again and again... but got nothing so posting the question here. Maybe I could get some view-point of others, to try and make it work...

The question is: we are given a SORTED array, which consists of a collection of values occurring an EVEN number of times, except one, which occurs ODD number of times. We need to find the solution in log n time.

It is easy to find the solution in O(n) time, but it looks pretty tricky to perform in log n time.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(14

念﹏祤嫣 2024-09-15 02:58:51

定理:此问题的每个确定性算法都会探测Ω(log2 n)内存位置最坏的情况。

证明(以更正式的风格完全重写):

令 k > 。 0 为奇数,令 n = k2。我们描述了一个强制 (log2 (k + 1))2 = Ω(log2 n) 探测的对手。

我们将相同元素的最大子序列称为。对手可能的输入由 k 个长度 k x1 x2 … xk 组成。对于每个线段 xj,存在一个整数 bj ∈ [0, k],使得 xj 由 bj 组成 个 j - 1 的副本,然后是 k - bj 个 j 的副本。每组最多重叠两个段,每个段最多重叠两个组。

Group boundaries
|   |     |   |   |
 0 0 1 1 1 2 2 3 3
|     |     |     |
Segment boundaries

每当增加 2 时,我们就按照惯例假定为双边界。

Group boundaries
|     ||       |   |
 0 0 0  2 2 2 2 3 3

声明第 jth 组边界 (1 ≤ j ≤ k) 的位置由线段 xj.

证明:它就在 ((j - 1) k + bj)th 内存位置之后,并且 x j 唯一确定 bj。 //

我们说该算法已观察第 jth 组边界,以防其探测 xj 的结果唯一确定 xj。按照惯例,始终观察输入的开头和结尾。该算法可以在不观察的情况下唯一地确定组边界的位置。

Group boundaries
|   X   |   |     |
 0 0 ? 1 2 2 3 3 3
|     |     |     |
Segment boundaries

仅给出 0 0 ?,算法无法确定是否 ?是 0 或 1。然而,在上下文中, ?必须为 1,否则将出现三个奇数组,并且可以推断 X 处的组边界。这些推论可能对对手来说是有问题的,但事实证明,只有在所讨论的群体边界“无关紧要”之后才能做出这些推论。

声明在算法执行期间的任何给定点,考虑其观察到的组边界集。正好有一个连续的对处于奇数距离,并且奇数组位于它们之间。

证明:每隔一个连续的对仅限制偶数组。 //

定义以特殊连续对为界的奇数长度子序列为相关子序列

声明相关子序列内部没有唯一确定的组边界。如果至少存在一个这样的边界,则奇数组的身份不是唯一确定的。

证明:不失一般性,假设不在相关子序列中的每个内存位置都有已被探测到,并且相关子序列中包含的每个片段恰好有一个未被探测到的位置。假设第 jth 组边界(称为 B)位于相关子序列的内部。根据假设,对 xj 的探测最多可确定 B 的位置最多两个连续的可能性。我们将距左侧观察边界奇数距离的一个称为奇左,将另一个称为奇右。对于这两种可能性,我们从左到右工作并固定每个剩余的内部组边界的位置,以便其左侧的组是偶数。 (我们可以这样做,因为它们也都有两个连续的可能性。)如果 B 位于奇数左侧,则其左侧的组是唯一的奇数组。如果 B 位于奇数右侧,则相关子序列中的最后一组是唯一的奇数组。两者都是有效输入,因此算法既没有唯一确定 B 的位置,也没有唯一确定奇数组的位置。 //

示例:

Observed group boundaries; relevant subsequence marked by […]
[             ]   |
 0 0 Y 1 1 Z 2 3 3
|     |     |     |
Segment boundaries

Possibility #1: Y=0, Z=2
Possibility #2: Y=1, Z=2
Possibility #3: Y=1, Z=1

由于这一主张,算法无论其工作原理如何必须将相关子序列缩小到一组。因此,根据定义,它必须遵守一些群体边界。对手现在的任务很简单,就是保持尽可能多的可能性。

在算法执行期间的任何给定点,对手都会在内部致力于相关子序列之外的每个内存位置的一种可能性。一开始,相关子序列是整个输入,因此没有初始承诺。每当算法探测 xj 的未提交位置时,对手必须提交两个值之一:j - 1 或 j。如果它可以避免让第 jth 边界被观察到,它会选择一个至少留下一半剩余可能性的值(相对于观察)。否则,它会选择将至少一半的组保留在相关间隔内,并为其他组提交值。

这样,对手迫使算法至少观察 log2 (k + 1) 个组边界,并且在观察第 jth 组边界时,算法被迫至少进行 log2 (k + 1) 次探测。


扩展:

通过随机化输入,将“最多减半”(从算法的角度来看)替换为“最多在期望中减半”,该结果直接扩展到随机算法 ,并应用标准浓度不等式。

它还扩展到任何组都不能大于 s 个副本的情况;在这种情况下,下限为Ω(log n log s)

Theorem: Every deterministic algorithm for this problem probes Ω(log2 n) memory locations in the worst case.

Proof (completely rewritten in a more formal style):

Let k > 0 be an odd integer and let n = k2. We describe an adversary that forces (log2 (k + 1))2 = Ω(log2 n) probes.

We call the maximal subsequences of identical elements groups. The adversary's possible inputs consist of k length-k segments x1 x2 … xk. For each segment xj, there exists an integer bj ∈ [0, k] such that xj consists of bj copies of j - 1 followed by k - bj copies of j. Each group overlaps at most two segments, and each segment overlaps at most two groups.

Group boundaries
|   |     |   |   |
 0 0 1 1 1 2 2 3 3
|     |     |     |
Segment boundaries

Wherever there is an increase of two, we assume a double boundary by convention.

Group boundaries
|     ||       |   |
 0 0 0  2 2 2 2 3 3

Claim: The location of the jth group boundary (1 ≤ j ≤ k) is uniquely determined by the segment xj.

Proof: It's just after the ((j - 1) k + bj)th memory location, and xj uniquely determines bj. //

We say that the algorithm has observed the jth group boundary in case the results of its probes of xj uniquely determine xj. By convention, the beginning and the end of the input are always observed. It is possible for the algorithm to uniquely determine the location of a group boundary without observing it.

Group boundaries
|   X   |   |     |
 0 0 ? 1 2 2 3 3 3
|     |     |     |
Segment boundaries

Given only 0 0 ?, the algorithm cannot tell for sure whether ? is a 0 or a 1. In context, however, ? must be a 1, as otherwise there would be three odd groups, and the group boundary at X can be inferred. These inferences could be problematic for the adversary, but it turns out that they can be made only after the group boundary in question is "irrelevant".

Claim: At any given point during the algorithm's execution, consider the set of group boundaries that it has observed. Exactly one consecutive pair is at odd distance, and the odd group lies between them.

Proof: Every other consecutive pair bounds only even groups. //

Define the odd-length subsequence bounded by the special consecutive pair to be the relevant subsequence.

Claim: No group boundary in the interior of the relevant subsequence is uniquely determined. If there is at least one such boundary, then the identity of the odd group is not uniquely determined.

Proof: Without loss of generality, assume that each memory location not in the relevant subsequence has been probed and that each segment contained in the relevant subsequence has exactly one location that has not been probed. Suppose that the jth group boundary (call it B) lies in the interior of the relevant subsequence. By hypothesis, the probes to xj determine B's location up to two consecutive possibilities. We call the one at odd distance from the left observed boundary odd-left and the other odd-right. For both possibilities, we work left to right and fix the location of every remaining interior group boundary so that the group to its left is even. (We can do this because they each have two consecutive possibilities as well.) If B is at odd-left, then the group to its left is the unique odd group. If B is at odd-right, then the last group in the relevant subsequence is the unique odd group. Both are valid inputs, so the algorithm has uniquely determined neither the location of B nor the odd group. //

Example:

Observed group boundaries; relevant subsequence marked by […]
[             ]   |
 0 0 Y 1 1 Z 2 3 3
|     |     |     |
Segment boundaries

Possibility #1: Y=0, Z=2
Possibility #2: Y=1, Z=2
Possibility #3: Y=1, Z=1

As a consequence of this claim, the algorithm, regardless of how it works, must narrow the relevant subsequence to one group. By definition, it therefore must observe some group boundaries. The adversary now has the simple task of keeping open as many possibilities as it can.

At any given point during the algorithm's execution, the adversary is internally committed to one possibility for each memory location outside of the relevant subsequence. At the beginning, the relevant subsequence is the entire input, so there are no initial commitments. Whenever the algorithm probes an uncommitted location of xj, the adversary must commit to one of two values: j - 1, or j. If it can avoid letting the jth boundary be observed, it chooses a value that leaves at least half of the remaining possibilities (with respect to observation). Otherwise, it chooses so as to keep at least half of the groups in the relevant interval and commits values for the others.

In this way, the adversary forces the algorithm to observe at least log2 (k + 1) group boundaries, and in observing the jth group boundary, the algorithm is forced to make at least log2 (k + 1) probes.


Extensions:

This result extends straightforwardly to randomized algorithms by randomizing the input, replacing "at best halved" (from the algorithm's point of view) with "at best halved in expectation", and applying standard concentration inequalities.

It also extends to the case where no group can be larger than s copies; in this case the lower bound is Ω(log n log s).

花伊自在美 2024-09-15 02:58:51

排序数组建议采用二分搜索。我们必须重新定义平等和比较。简单的相等意味着元素数量为奇数。我们可以通过观察组中第一个或最后一个元素的索引来进行比较。第一个元素将是奇数组之前的偶数索引(从 0 开始),以及奇数组之后的奇数索引。我们可以使用二分搜索找到组的第一个和最后一个元素。总成本为 O((log N)²)。

证明 O((log N)²)

  T(2) = 1 //to make the summation nice
  T(N) = log(N) + T(N/2) //log(N) is finding the first/last elements

对于某些 N=2^k,

T(2^k) = (log 2^k) + T(2^(k-1))
       = (log 2^k) + (log 2^(k-1)) + T(2^(k-2))
       = (log 2^k) + (log 2^(k-1)) + (log 2^(k-2)) + ... + (log 2^2) + 1
       = k + (k-1) + (k-2) + ... + 1
       = k(k+1)/2
       = (k² + k)/2
       = (log(N)² + log(N))/ 2
       = O(log(N)²)

A sorted array suggests a binary search. We have to redefine equality and comparison. Equality simple means an odd number of elements. We can do comparison by observing the index of the first or last element of the group. The first element will be an even index (0-based) before the odd group, and an odd index after the odd group. We can find the first and last elements of a group using binary search. The total cost is O((log N)²).

PROOF OF O((log N)²)

  T(2) = 1 //to make the summation nice
  T(N) = log(N) + T(N/2) //log(N) is finding the first/last elements

For some N=2^k,

T(2^k) = (log 2^k) + T(2^(k-1))
       = (log 2^k) + (log 2^(k-1)) + T(2^(k-2))
       = (log 2^k) + (log 2^(k-1)) + (log 2^(k-2)) + ... + (log 2^2) + 1
       = k + (k-1) + (k-2) + ... + 1
       = k(k+1)/2
       = (k² + k)/2
       = (log(N)² + log(N))/ 2
       = O(log(N)²)
疑心病 2024-09-15 02:58:51

查看数组的中间元素。通过几次适当的二分搜索,您可以找到数组中的第一个和最后一个出现。例如,如果中间元素是'a',则需要找到ij,如下所示:

[* * * * a a a a * * *]
         ^     ^ 
         |     |
         |     |
         i     j

j - i是偶数吗?你完成了!否则(这是这里的关键),要问的问题i是偶数还是奇数?你明白这段知识意味着什么吗?然后剩下的就很容易了。

Look at the middle element of the array. With a couple of appropriate binary searches, you can find the first and its last appearance in the array. E.g., if the middle element is 'a', you need to find i and j as shown below:

[* * * * a a a a * * *]
         ^     ^ 
         |     |
         |     |
         i     j

Is j - i an even number? You are done! Otherwise (and this is the key here), the question to ask is i an even or an odd number? Do you see what this piece of knowledge implies? Then the rest is easy.

若能看破又如何 2024-09-15 02:58:51

此答案支持“throtawayacct”发布的答案。他应得的赏金。我在这个问题上花了一些时间,我完全相信他的证明是正确的,您需要 Ω(log(n)^2) 查询来查找出现奇数次的数字。我对此深信不疑,因为在浏览了他的解决方案后,我最终重新创建了完全相同的论点。

在该解决方案中,对手创建一个输入,使算法变得困难,但对人类分析人员来说却很简单。输入由 k 个页面组成,每个页面有 k 个条目。条目总数为 n = k^2,并且重要的是 O(log(k)) = O(log(n)) 和 Ω(log(k)) = Ω(log(n))。为了进行输入,攻击者制作一个长度为 k、形式为 00...011...1 的字符串,并在任意位置进行转换。然后字符串中的每个符号被扩展为长度为 k 的页面,其形式为 aa...abb...b,其中在第 i 页上,a=i 且 b=i+1。每个页面上的转换也可以处于任意位置,除了奇偶校验与页面扩展的符号一致之外。

了解分析算法最坏情况的“对手方法”非常重要。对手回答有关算法输入的查询,但不承诺未来的答案。答案必须一致,当对手被足够确定以使算法得出结论时,游戏就结束了。

在此背景下,以下是一些观察结果:

1) 如果您想通过在页面中进行查询来了解页面中转换的奇偶性,则必须了解转换的确切位置,并且需要 Ω(log(k) )查询。任何查询集合都将转换点限制为一个间隔,并且任何长度大于 1 的间隔都具有两个奇偶校验。对该页面中的转换最有效的搜索是二分搜索。

2) 最微妙和最重要的一点:有两种方法可以确定特定页面内转换的奇偶性。您可以在该页面中进行足够的查询来查找转换,或者如果您在较早的页面和较晚的页面中发现相同的奇偶校验,则可以推断奇偶校验。这种非此即彼的情况是无法避免的。任何一组查询都会将每个页面中的转换点限制在某个间隔内。对奇偶性的唯一限制来自长度为 1 的间隔。否则,过渡点可以自由摆动以获得任何一致的奇偶性。

3)在对手方法中,没有幸运的打击。例如,假设您在某个页面中的第一个查询是朝向一端而不是中间。由于对手尚未承诺给出答案,因此他可以自由地将过渡置于长边。

4) 最终结果是您被迫直接探测 Ω(log(k)) 页中的奇偶校验,并且每个子问题的工作量也是 Ω(log(k))。

5)随机选择的情况并不比对抗性选择好多少。数学更加复杂,因为现在您可以获得部分统计信息,而不是严格的“是”,您知道奇偶校验或“否”,您不知道。但这没什么区别。例如,您可以给每个页面长度 k^2,这样每个页面中的第一个 log(k) 查询很有可能告诉您该页面中的奇偶校验几乎没有任何信息。对手可以在一开始就做出随机选择,但它仍然有效。

This answer is in support of the answer posted by "throwawayacct". He deserves the bounty. I spent some time on this question and I'm totally convinced that his proof is correct that you need Ω(log(n)^2) queries to find the number that occurs an odd number of times. I'm convinced because I ended up recreating the exact same argument after only skimming his solution.

In the solution, an adversary creates an input to make life hard for the algorithm, but also simple for a human analyzer. The input consists of k pages that each have k entries. The total number of entries is n = k^2, and it is important that O(log(k)) = O(log(n)) and Ω(log(k)) = Ω(log(n)). To make the input, the adversary makes a string of length k of the form 00...011...1, with the transition in an arbitrary position. Then each symbol in the string is expanded into a page of length k of the form aa...abb...b, where on the ith page, a=i and b=i+1. The transition on each page is also in an arbitrary position, except that the parity agrees with the symbol that the page was expanded from.

It is important to understand the "adversary method" of analyzing an algorithm's worst case. The adversary answers queries about the algorithm's input, without committing to future answers. The answers have to be consistent, and the game is over when the adversary has been pinned down enough for the algorithm to reach a conclusion.

With that background, here are some observations:

1) If you want to learn the parity of a transition in a page by making queries in that page, you have to learn the exact position of the transition and you need Ω(log(k)) queries. Any collection of queries restricts the transition point to an interval, and any interval of length more than 1 has both parities. The most efficient search for the transition in that page is a binary search.

2) The most subtle and most important point: There are two ways to determine the parity of a transition inside a specific page. You can either make enough queries in that page to find the transition, or you can infer the parity if you find the same parity in both an earlier and a later page. There is no escape from this either-or. Any set of queries restricts the transition point in each page to some interval. The only restriction on parities comes from intervals of length 1. Otherwise the transition points are free to wiggle to have any consistent parities.

3) In the adversary method, there are no lucky strikes. For instance, suppose that your first query in some page is toward one end instead of in the middle. Since the adversary hasn't committed to an answer, he's free to put the transition on the long side.

4) The end result is that you are forced to directly probe the parities in Ω(log(k)) pages, and the work for each of these subproblems is also Ω(log(k)).

5) Things are not much better with random choices than with adversarial choices. The math is more complicated, because now you can get partial statistical information, rather than a strict yes you know a parity or no you don't know it. But it makes little difference. For instance, you can give each page length k^2, so that with high probability, the first log(k) queries in each page tell you almost nothing about the parity in that page. The adversary can make random choices at the beginning and it still works.

夜无邪 2024-09-15 02:58:51

从数组的中间开始向后走,直到得到与中心的值不同的值。检查该边界上方的数字是否处于奇数或偶数索引。如果是奇数,则出现奇数次的数字位于左侧,因此请在开头和找到的边界之间重复搜索。如果是偶数,则出现奇数次的数字一定位于数组的后面,因此在右半部分重复搜索。

如前所述,它既有对数分量又有线性分量。如果您想让整个过程保持对数,而不是仅仅向后遍历数组到不同的值,您需要使用二分搜索。除非您期望相同数字出现多次重复,否则二分搜索可能不值得。

Start at the middle of the array and walk backward until you get to a value that's different from the one at the center. Check whether the number above that boundary is at an odd or even index. If it's odd, then the number occurring an odd number of times is to the left, so repeat your search between the beginning and the boundary you found. If it's even, then the number occurring an odd number of times must be later in the array, so repeat the search in the right half.

As stated, this has both a logarithmic and a linear component. If you want to keep the whole thing logarithmic, instead of just walking backward through the array to a different value, you want to use a binary search instead. Unless you expect many repetitions of the same numbers, the binary search may not be worthwhile though.

他是夢罘是命 2024-09-15 02:58:51

我有一个适用于 log(N/C)*log(K) 的算法,其中 K 是最大同值范围的长度,C 是正在搜索的范围的长度。

该算法与之前发布的大多数算法的主要区别在于,它利用了所有相同值范围都很短的情况。它不是通过对整个数组进行二分搜索来找到边界,而是首先通过跳回 1, 2, 4, 8, ...(log(K) 迭代)步快速找到粗略估计,然后对数组进行二分搜索结果范围(再次为 log(K))。

算法如下(用C#编写):

// Finds the start of the range of equal numbers containing the index "index", 
// which is assumed to be inside the array
// 
// Complexity is O(log(K)) with K being the length of range
static int findRangeStart (int[] arr, int index)
{
    int candidate = index;
    int value = arr[index];
    int step = 1;

    // find the boundary for binary search:
    while(candidate>=0 && arr[candidate] == value)
    {
        candidate -= step;
        step *= 2;
    }

    // binary search:
    int a = Math.Max(0,candidate);
    int b = candidate+step/2;
    while(a+1!=b)
    {
        int c = (a+b)/2;
        if(arr[c] == value)
            b = c;
        else
            a = c;
    }
    return b;
}

// Finds the index after the only "odd" range of equal numbers in the array.
// The result should be in the range (start; end]
// The "end" is considered to always be the end of some equal number range.
static int search(int[] arr, int start, int end)
{
    if(arr[start] == arr[end-1])
        return end;

    int middle = (start+end)/2;

    int rangeStart = findRangeStart(arr,middle);

    if((rangeStart & 1) == 0)
        return search(arr, middle, end);
    return search(arr, start, rangeStart);
}

// Finds the index after the only "odd" range of equal numbers in the array
static int search(int[] arr)
{
    return search(arr, 0, arr.Length);
}

I have an algorithm which works in log(N/C)*log(K), where K is the length of maximum same-value range, and C is the length of range being searched for.

The main difference of this algorithm from most posted before is that it takes advantage of the case where all same-value ranges are short. It finds boundaries not by binary-searching the entire array, but by first quickly finding a rough estimate by jumping back by 1, 2, 4, 8, ... (log(K) iterations) steps, and then binary-searching the resulting range (log(K) again).

The algorithm is as follows (written in C#):

// Finds the start of the range of equal numbers containing the index "index", 
// which is assumed to be inside the array
// 
// Complexity is O(log(K)) with K being the length of range
static int findRangeStart (int[] arr, int index)
{
    int candidate = index;
    int value = arr[index];
    int step = 1;

    // find the boundary for binary search:
    while(candidate>=0 && arr[candidate] == value)
    {
        candidate -= step;
        step *= 2;
    }

    // binary search:
    int a = Math.Max(0,candidate);
    int b = candidate+step/2;
    while(a+1!=b)
    {
        int c = (a+b)/2;
        if(arr[c] == value)
            b = c;
        else
            a = c;
    }
    return b;
}

// Finds the index after the only "odd" range of equal numbers in the array.
// The result should be in the range (start; end]
// The "end" is considered to always be the end of some equal number range.
static int search(int[] arr, int start, int end)
{
    if(arr[start] == arr[end-1])
        return end;

    int middle = (start+end)/2;

    int rangeStart = findRangeStart(arr,middle);

    if((rangeStart & 1) == 0)
        return search(arr, middle, end);
    return search(arr, start, rangeStart);
}

// Finds the index after the only "odd" range of equal numbers in the array
static int search(int[] arr)
{
    return search(arr, 0, arr.Length);
}
油饼 2024-09-15 02:58:51

取中间元素e。使用二分查找查找第一个和最后一个出现的位置。 O(log(n))
如果是奇数则返回 e。
否则,递归到具有奇数个元素的一侧 [....]eeee[....]

运行时间将为 log(n) + log(n/2) + log(n/4)... . = O(log(n)^2)。

Take the middle element e. Use binary search to find the first and last occurrence. O(log(n))
If it is odd return e.
Otherwise, recurse onto the side that has an odd number of elements [....]eeee[....]

Runtime will be log(n) + log(n/2) + log(n/4).... = O(log(n)^2).

丶视觉 2024-09-15 02:58:51

啊啊。有一个答案。

执行二分搜索,并在搜索时向后移动每个值,直到找到具有相同值的第一个条目。如果它的索引是偶数,则它位于奇数之前,因此向右移动。
如果它的数组索引是奇数,则它位于奇数之后,因此向左移动。

在伪代码中(这是一般思想,未经测试......):

    private static int FindOddBall(int[] ary)
    {
        int l = 0,
            r = ary.Length - 1;
        int n = (l+r)/2;
        while (r > l+2)
        {
            n = (l + r) / 2;
            while (ary[n] == ary[n-1])
                n = FindBreakIndex(ary, l, n);
            if (n % 2 == 0) // even index we are on or to the left of the oddball 
                l = n;
            else            // odd index we are to the right of the oddball
                r = n-1;
        }
        return ary[l];
    }
    private static int FindBreakIndex(int[] ary, int l, int n)
    {
        var t = ary[n];
        var r = n;
        while(ary[n] != t || ary[n] == ary[n-1])
            if(ary[n] == t)
            {
                r = n;
                n = (l + r)/2;
            }
            else
            {
                l = n;
                n = (l + r)/2;
            }
        return n;
    }

AHhh. There is an answer.

Do a binary search and as you search, for each value, move backwards until you find the first entry with that same value. If its index is even, it is before the oddball, so move to the right.
If its array index is odd, it is after the oddball, so move to the left.

In pseudocode (this is the general idea, not tested...):

    private static int FindOddBall(int[] ary)
    {
        int l = 0,
            r = ary.Length - 1;
        int n = (l+r)/2;
        while (r > l+2)
        {
            n = (l + r) / 2;
            while (ary[n] == ary[n-1])
                n = FindBreakIndex(ary, l, n);
            if (n % 2 == 0) // even index we are on or to the left of the oddball 
                l = n;
            else            // odd index we are to the right of the oddball
                r = n-1;
        }
        return ary[l];
    }
    private static int FindBreakIndex(int[] ary, int l, int n)
    {
        var t = ary[n];
        var r = n;
        while(ary[n] != t || ary[n] == ary[n-1])
            if(ary[n] == t)
            {
                r = n;
                n = (l + r)/2;
            }
            else
            {
                l = n;
                n = (l + r)/2;
            }
        return n;
    }
桃酥萝莉 2024-09-15 02:58:51

您可以使用此算法:

int GetSpecialOne(int[] array, int length)
{
   int specialOne = array[0];

   for(int i=1; i < length; i++)
   {
      specialOne ^= array[i];
   }
   return specialOne;
}

在类似问题的帮助下解决,可以找到这里 http://www.technicalinterviewquestions.net

You can use this algorithm:

int GetSpecialOne(int[] array, int length)
{
   int specialOne = array[0];

   for(int i=1; i < length; i++)
   {
      specialOne ^= array[i];
   }
   return specialOne;
}

Solved with the help of a similar question which can be found here on http://www.technicalinterviewquestions.net

驱逐舰岛风号 2024-09-15 02:58:51

我们没有关于数组内部长度分布以及整个数组的长度分布的任何信息,对吧?

因此 arraylength 可能是 1, 11, 101, 1001 或其他,至少 1 没有上限,并且必须包含至少 1 种类型的元素 ('number') 最多 (length-1)/2 + 1 个元素,对于总大小 1、11、101:1、1 到 6、1 到 51 个元素,依此类推。

我们是否应该假设每种可能的概率大小都相等?这会导致子数组的长度为中等长度/4,不是吗?

大小为 5 的数组可以分为 1、2 或 3 个子列表。

如果我们深入细节,看似显而易见的事情其实并不那么明显。

大小为 5 的数组可以通过一种方式“划分”为一个子列表,称其为“划分”的说法有争议。它只是一个包含 5 个元素的列表 (aaaaa)。为了避免混淆,我们假设列表中的元素是有序字符,而不是数字(a,b,c,...)。

分为两个子列表,它们可能是 (1, 4), (2, 3), (3, 2), (4, 1)。 (abbbb、aabbb、aaabb、aaaab)。

现在让我们回顾一下之前提出的主张:“除法”(5) 是否应该假设与 4 个除法为 2 个子列表的概率相同?或者我们应该将它们混合在一起,并假设每个分区的概率均等(1/5)?

或者我们可以在不知道子列表长度概率的情况下计算解决方案吗?

We don't have any information about the distribution of lenghts inside the array, and of the array as a whole, right?

So the arraylength might be 1, 11, 101, 1001 or something, 1 at least with no upper bound, and must contain at least 1 type of elements ('number') up to (length-1)/2 + 1 elements, for total sizes of 1, 11, 101: 1, 1 to 6, 1 to 51 elements and so on.

Shall we assume every possible size of equal probability? This would lead to a middle length of subarrays of size/4, wouldn't it?

An array of size 5 could be divided into 1, 2 or 3 sublists.

What seems to be obvious is not that obvious, if we go into details.

An array of size 5 can be 'divided' into one sublist in just one way, with arguable right to call it 'dividing'. It's just a list of 5 elements (aaaaa). To avoid confusion let's assume the elements inside the list to be ordered characters, not numbers (a,b,c, ...).

Divided into two sublist, they might be (1, 4), (2, 3), (3, 2), (4, 1). (abbbb, aabbb, aaabb, aaaab).

Now let's look back at the claim made before: Shall the 'division' (5) be assumed the same probability as those 4 divisions into 2 sublists? Or shall we mix them together, and assume every partition as evenly probable, (1/5)?

Or can we calculate the solution without knowing the probability of the length of the sublists?

尘曦 2024-09-15 02:58:51

线索是您正在寻找 log(n)。那小于n。

一次一个地遍历整个阵列?那是n。那是行不通的。

我们知道数组中的前两个索引(0 和 1)应该是相同的数字。与 50 和 51 相同,如果数组中的奇数位于它们之后。

因此,找到数组中的中间元素,将其与其后面的元素进行比较。如果数字的变化发生在错误的索引上,我们就知道数组中的奇数在它之前;否则,就是之后了。通过一组比较,我们可以找出目标位于数组的哪一半。

从那里继续。

The clue is you're looking for log(n). That's less than n.

Stepping through the entire array, one at a time? That's n. That's not going to work.

We know the first two indexes in the array (0 and 1) should be the same number. Same with 50 and 51, if the odd number in the array is after them.

So find the middle element in the array, compare it to the element right after it. If the change in numbers happens on the wrong index, we know the odd number in the array is before it; otherwise, it's after. With one set of comparisons, we figure out which half of the array the target is in.

Keep going from there.

余厌 2024-09-15 02:58:51

使用哈希表

For each element E in the input set
    if E is set in the hash table
         increment it's value
    else        
         set E in the hash table and initialize it to 0

For each key K in hash table
    if K % 2 = 1
        return K

由于该算法是 2n 它属于 O(n)

Use a hash table

For each element E in the input set
    if E is set in the hash table
         increment it's value
    else        
         set E in the hash table and initialize it to 0

For each key K in hash table
    if K % 2 = 1
        return K

As this algorithm is 2n it belongs to O(n)

独木成林 2024-09-15 02:58:51

试试这个:

int getOddOccurrence(int ar[], int ar_size)
{
     int i;
     int xor = 0; 
     for (i=0; i < ar_size; i++)     
        xor = xor ^ ar[i];

     return res;
}

每次与相同数字进行异或时,异或都会抵消,因此 1^1=0 但 1^1^1=1,因此每一对都应该抵消,留下奇数。

Try this:

int getOddOccurrence(int ar[], int ar_size)
{
     int i;
     int xor = 0; 
     for (i=0; i < ar_size; i++)     
        xor = xor ^ ar[i];

     return res;
}

XOR will cancel out everytime you XOR with the same number so 1^1=0 but 1^1^1=1 so every pair should cancel out leaving the odd number out.

揽清风入怀 2024-09-15 02:58:51

假设索引从 0 开始。二分查找最小偶数 i,使得 x[i] != x[i+1];你的答案是x[i]。

编辑:由于公众要求,这里是代码

int f(int *x, int min, int max) {
  int size = max;
  min /= 2;
  max /= 2;
  while (min < max) {
    int i = (min + max)/2;
    if (i==0 || x[2*i-1] == x[2*i])
      min = i+1;
    else
      max = i-1;
  }
  if (2*max == size || x[2*max] != x[2*max+1])
    return x[2*max];
  return x[2*min];
}

Assume indexing start at 0. Binary search for the smallest even i such that x[i] != x[i+1]; your answer is x[i].

edit: due to public demand, here is the code

int f(int *x, int min, int max) {
  int size = max;
  min /= 2;
  max /= 2;
  while (min < max) {
    int i = (min + max)/2;
    if (i==0 || x[2*i-1] == x[2*i])
      min = i+1;
    else
      max = i-1;
  }
  if (2*max == size || x[2*max] != x[2*max+1])
    return x[2*max];
  return x[2*min];
}
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文