从多个数组之一中查找最小和最大元素

发布于 2024-11-09 06:57:00 字数 291 浏览 0 评论 0原文

我在亚马逊面试期间收到一个问题,希望获得解决该问题的帮助。

给定 N 个大小为 K 的数组,N 个数组中的每个 K 个元素都已排序,并且每个 N*K 个元素都是唯一的。从 N 个数组中的每一个以及所选的 N 个元素子集中选择一个元素。减去最小和最大元素。这种差异应该是尽可能最小的。

示例:

N=3, K=3

N=1 : 6, 16, 67
N=2 : 11,17,68
N=3 : 10, 15, 100

如果选择 16、17、15,我们得到的最小差异为 17-15=2。

I received a question during an Amazon interview and would like assistance with solving it.

Given N arrays of size K each, each of these K elements in the N arrays are sorted, and each of these N*K elements are unique. Choose a single element from each of the N arrays, from the chosen subset of N elements. Subtract the minimum and maximum element. This difference should be the least possible minimum.

Sample:

N=3, K=3

N=1 : 6, 16, 67
N=2 : 11,17,68
N=3 : 10, 15, 100

here if 16, 17, 15 are chosen, we get the minimum difference as
17-15=2.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(7

酒解孤独 2024-11-16 06:57:00

我可以想到 O(N*K*N)(在 zivo 正确指出后进行编辑,现在不是一个好的解决方案:( )解决方案。
1.取N个指针,最初指向N个数组中每个数组的初始元素。

6, 16, 67
^ 
11,17,68
^
10, 15, 100
^ 

2、找出当前指针O(k)(6和11)中最高和最低的元素,并求它们之间的差。(5)
3. 将指向该数组中最低元素的指针加 1。

 6, 16, 67
    ^ 
 11,17,68
 ^
 10, 15, 100 (difference:5)
 ^ 

4. 继续重复步骤2和3并存储最小差异。

 6, 16, 67
    ^ 
 11,17,68
 ^
 10,15,100 (difference:5)
    ^ 


 6, 16, 67
    ^ 
 11,17,68
    ^
 10,15,100 (difference:2)
    ^ 

以上将是所需的解决方案。

 6, 16, 67
    ^ 
 11,17,68
    ^
 10,15,100 (difference:84)
       ^ 

 6, 16, 67
        ^ 
 11,17,68
    ^
 10,15,100 (difference:83)
       ^ 

依此类推......

编辑:

可以通过使用堆来降低其复杂性(如 Uri 建议)。我想到了,但遇到了一个问题:每次从堆中提取一个元素时,都必须找出它的数组号,以便增加该数组对应的指针。 一种有效的查找数组编号的方法绝对可以将复杂度降低到 O(K*N log(K*N))。一种简单的方法是使用这样的数据结构

Struct
{
    int element;
    int arraynumer;
}

并重建初始数据,例如

 6|0,16|0,67|0

 11|1,17|1,68|1

 10|2,15|2,100|2

最初保留第一列的当前最大值并将指向的元素插入堆中。现在每次提取一个元素时,都可以找到它的数组号,该数组中的指针会递增,新指向的元素可以与当前的 max 进行比较,并且可以相应地调整 max 指针。

I can think of O(N*K*N)(edited after correctly pointed out by zivo, not a good solution now :( ) solution.
1. Take N pointer initially pointing to initial element each of N arrays.

6, 16, 67
^ 
11,17,68
^
10, 15, 100
^ 

2. Find out the highest and lowest element among the current pointer O(k) (6 and 11) and find the difference between them.(5)
3. Increment the pointer which is pointing to lowest element by 1 in that array.

 6, 16, 67
    ^ 
 11,17,68
 ^
 10, 15, 100 (difference:5)
 ^ 

4. Keep repeating step 2 and 3 and store the minimum difference.

 6, 16, 67
    ^ 
 11,17,68
 ^
 10,15,100 (difference:5)
    ^ 


 6, 16, 67
    ^ 
 11,17,68
    ^
 10,15,100 (difference:2)
    ^ 

Above will be the required solution.

 6, 16, 67
    ^ 
 11,17,68
    ^
 10,15,100 (difference:84)
       ^ 

 6, 16, 67
        ^ 
 11,17,68
    ^
 10,15,100 (difference:83)
       ^ 

And so on......

EDIT:

Its complexity can be reduced by using a heap (as suggested by Uri). I thought of it but faced a problem: Each time an element is extracted from heap, its array number has to be found out in order to increment the corresponding pointer for that array. An efficient way to find array number can definitely reduce the complexity to O(K*N log(K*N)). One naive way is to use a data structure like this

Struct
{
    int element;
    int arraynumer;
}

and reconstruct the initial data like

 6|0,16|0,67|0

 11|1,17|1,68|1

 10|2,15|2,100|2

Initially keep the current max for first column and insert the pointed elements in heap. Now each time an element is extracted, its array number can be found out, pointer in that array is incremented , the newly pointed element can be compared to current max and max pointer can be adjusted accordingly.

蓝戈者 2024-11-16 06:57:00

因此,这里有一个算法可以分两步解决这个问题:

第一步是将所有数组合并到一个排序数组中,如下所示:

combined_val[] - 其中包含所有数字
组合_ind[] - 保存该数字最初属于

此步骤的哪个数组的索引可以在 O(K*N*log(N)) 中轻松完成,但我认为你也可以做得更好(也许不是,你可以合并排序的查找变体,因为它们的步骤与此类似)

现在第二步:

只放置代码而不是解释会更容易,所以这里是伪代码:


int count[N] = { 0 }
int head = 0;
int diffcnt = 0;
// mindiff is initialized to overall maximum value - overall minimum value
int mindiff = combined_val[N * K - 1] - combined_val[0];
for (int i = 0; i < N * K; i++) 
{
  count[combined_ind[i]]++;

  if (count[combined_ind[i]] == 1) {
    // diffcnt counts how many arrays have at least one element between
    // indexes of "head" and "i". Once diffcnt reaches N it will stay N and
    // not increase anymore
    diffcnt++;
  } else {
    while (count[combined_ind[head]] > 1) {
      // We try to move head index as forward as possible while keeping diffcnt constant.
      // i.e. if count[combined_ind[head]] is 1, then if we would move head forward
      // diffcnt would decrease, that is something we dont want to do.
      count[combined_ind[head]]--;
      head++;
    }
  }

  if (diffcnt == N) {
    // i.e. we got at least one element from all arrays
    if (combined_val[i] - combined_val[head] < mindiff) {
      mindiff = combined_val[i] - combined_val[head];
      // if you want to save actual numbers too, you can save this (i.e. i and head
      // and then extract data from that)
    }
  }
}

结果在 Mindiff 中。

第二步的运行时间为O(N * K)。这是因为“head”索引最多只会移动 N*K 次。所以内循环不会使这个二次,它仍然是线性的。

所以总的算法运行时间是 O(N * K * log(N)),但是这是因为合并步骤,如果你能想出更好的合并步骤,你可能可以将其降低到 O(N * K)。

So here is an algorithm to do solve this problem in two steps:

First step is to merge all your arrays into one sorted array which would look like this:

combined_val[] - which holds all numbers
combined_ind[] - which holds index of which array did this number originally belonged to

this step can be done easily in O(K*N*log(N)) but i think you can do better than that too (maybe not, you can lookup variants of merge sort because they do step similar to that)

Now second step:

it is easier to just put code instead of explaining so here is the pseduocode:


int count[N] = { 0 }
int head = 0;
int diffcnt = 0;
// mindiff is initialized to overall maximum value - overall minimum value
int mindiff = combined_val[N * K - 1] - combined_val[0];
for (int i = 0; i < N * K; i++) 
{
  count[combined_ind[i]]++;

  if (count[combined_ind[i]] == 1) {
    // diffcnt counts how many arrays have at least one element between
    // indexes of "head" and "i". Once diffcnt reaches N it will stay N and
    // not increase anymore
    diffcnt++;
  } else {
    while (count[combined_ind[head]] > 1) {
      // We try to move head index as forward as possible while keeping diffcnt constant.
      // i.e. if count[combined_ind[head]] is 1, then if we would move head forward
      // diffcnt would decrease, that is something we dont want to do.
      count[combined_ind[head]]--;
      head++;
    }
  }

  if (diffcnt == N) {
    // i.e. we got at least one element from all arrays
    if (combined_val[i] - combined_val[head] < mindiff) {
      mindiff = combined_val[i] - combined_val[head];
      // if you want to save actual numbers too, you can save this (i.e. i and head
      // and then extract data from that)
    }
  }
}

the result is in mindiff.

The runing time of second step is O(N * K). This is because "head" index will move only N*K times maximum. so the inner loop does not make this quadratic, it is still linear.

So total algorithm running time is O(N * K * log(N)), however this is because of merging step, if you can come up with better merging step you can probably bring it down to O(N * K).

御守 2024-11-16 06:57:00

这个问题是针对经理的

你有 3 名开发人员 (N1)、3 名测试人员 (N2) 和 3 名 DBA (N3)
选择能够成功运行项目的分歧较小的团队。

int[n] result;// where result[i] keeps the element from bucket N_i

int[n] latest;//where latest[i] keeps the latest element visited from bucket N_i

Iterate elements in (N_1 + N_2 + N_3) in sorted order
{
    Keep track of latest element visited from each bucket N_i by updating 'latest' array;

    if boundary(latest) < boundary(result)
    {
       result = latest;
    }
}

int boundary(int[] array)
{
   return Max(array) - Min(array);
}

This problem is for managers

You have 3 developers (N1), 3 testers (N2) and 3 DBAs (N3)
Choose the less divergent team that can run a project successfully.

int[n] result;// where result[i] keeps the element from bucket N_i

int[n] latest;//where latest[i] keeps the latest element visited from bucket N_i

Iterate elements in (N_1 + N_2 + N_3) in sorted order
{
    Keep track of latest element visited from each bucket N_i by updating 'latest' array;

    if boundary(latest) < boundary(result)
    {
       result = latest;
    }
}

int boundary(int[] array)
{
   return Max(array) - Min(array);
}
匿名的好友 2024-11-16 06:57:00

我有 O(K*N*log(K)),典型的执行要少得多。目前想不出更好的办法。我将首先解释更容易描述的(执行时间稍长):

  1. 对于第一个数组中的每个元素 f (循环遍历 K 个元素)
  2. 对于每个数组,从第二个数组开始(循环遍历 N-1 个数组)
  3. 进行二分搜索在数组上,找到最接近 f 的元素。这是您的元素 (Log(K))

如果您为每个数组添加一个新的下限索引,则可以优化该算法。执行二分查找时,在“Floor”到“K-1”之间搜索。
最初,Floor 索引为 0,对于第一个元素,您将搜索整个数组。一旦找到最接近“f”的元素,请使用该元素的索引更新楼层索引。最坏的情况是相同的(如果第一个数组的最大元素小于任何其他最小值,则Floor可能不会更新),但平均情况会有所改善。

I've O(K*N*log(K)), with typical execution much less. Currently cannot think anything better. I'll explain first the easier to describe (somewhat longer execution):

  1. For each element f in the first array (loop through K elements)
  2. For each array, starting from the second array (loop through N-1 arrays)
  3. Do a binary search on the array, and find element closest to f. This is your element (Log(K))

This algorithm can be optimized, if for each array, you add a new Floor Index. When performent the binary search, search between 'Floor' to 'K-1'.
Initially Floor index is 0, and for first element you search through the entire arrays. Once you find an element closest to 'f', update the Floor Index with the index of that element. Worse case is the same (Floor may not update, if maximum element of first array is smaller than any other minimum), but average case will improve.

一张白纸 2024-11-16 06:57:00

接受答案的正确性证明(终端的解决方案)

假设算法找到一系列 A=这不是最优解 (R)。

考虑 R 中的索引 j,这样项目 R[j] 是 R 中算法检查的第一项,并将其替换为其行中的下一项。

让 A' 表示该阶段(替换之前)的候选解决方案。由于R[j]=A'[j]是A'的最小值,所以它也是R的最小值。
现在,考虑 R 的最大值,R[m]。如果A'[m]<R[m],则可以通过用A'[m]代替R[m]来改进R,这与R是最优的事实相矛盾。因此,A'[m]=R[m]。
换句话说,R 和 A' 具有相同的最大值和最小值,因此它们是等价的。这就完成了证明:如果 R 是最优解,那么算法保证找到与 R 一样好的解。

Correctness proof for the accepted answer (Terminal's solution)

Assume that the algorithm finds a series A=<A[1],A[2],...,A[N]> which isn't the optimal solution (R).

Consider the index j in R, such that item R[j] is the first item among R that the algorithm examines and replaces it with the next item in its row.

Let A' denote the candidate solution at that phase (prior to the replacement). Since R[j]=A'[j] is the minimum value of A', it's also the minimum of R.
Now, consider the maximum value of R, R[m]. If A'[m]<R[m], then R can be improved by replacing R[m] with A'[m], which contradicts the fact that R is optimal. Therefore, A'[m]=R[m].
In other words, R and A' share the same maximum and minimum, therefore they are equivalent. This completes the proof: if R is an optimal solution, then the algorithm is guaranteed to find a solution as good as R.

香草可樂 2024-11-16 06:57:00

对于第一个数组中的每个元素

    choose the element in 2nd array that is closest to the element in 1st array
    current_array = 2;
    do
    {
        choose the element in current_array+1 that is closest to the element in current_array
        current_array++;
    } while(current_array < n);

复杂度:O(k^2*n)

for every element in 1st array

    choose the element in 2nd array that is closest to the element in 1st array
    current_array = 2;
    do
    {
        choose the element in current_array+1 that is closest to the element in current_array
        current_array++;
    } while(current_array < n);

complexity: O(k^2*n)

夜还是长夜 2024-11-16 06:57:00

以下是我如何解决此问题的逻辑,请记住,我们需要从 N 个数组中的每一个中选取一个元素(以计算最小最小值)

// if we take the above values as an example!
// then the idea would be to sort all three arrays while keeping another
// array to keep the reference to their sets (1 or 2 or 3, could be 
// extended to n sets)      
1   3   2   3   1   2   1   2   3    // this is the array that holds the set index
6   10  11  15  16  17  67  68  100  // this is the sorted combined array.
           |           |   
    5            2          33       // this is the computed least minimum,
                                     // the rule is to make sure the indexes of the values 
                                     // we are comparing are different (to make sure we are 
// comparing elements from different sets), then for example
// the first element of that example is index:1|value:6 we hold 
// that value 6 (that is the value we will be using to compute the least minimum, 
// then we go to the edge of the comparison which would be the second different index, 
// we skip index:3|value:10 (we remove it from the array) we compare index:2|value:11 
// to index:1|value:6 we obtain 5 which would go to a variable named leastMinimum = 5, 
// now we remove the indexes and values we already used,
// and redo the same steps.

第 1 步

1   3   2   3   1   2   1   2   3
6   10  11  15  16  17  67  68  100
           |   
5            
leastMinumum = 5

第 2 步:

3   1   2   1   2   3
15  16  17  67  68  100
           |   
 2          
leastMinimum = min(2, leastMinumum) // which is equal 2

第 3 步:

1   2   3
67  68  100

    33
leastMinimum = min(33, leastMinumum) // which is equal to old leastMinumum which is 2

现在:我们假设同一个数组中的元素彼此非常接近(这次 k=2,这意味着我们只有 3 个集合,其中有两个元素)值):

// After sorting the n arrays we will have the below indexes array and values array
1   1   2   3   2   3
6   7   8   12  15  16
*       *   *

* we skip second index of 1|7 and we take the least minimum of 1|6 and 3|12 (index:2|value:8 will be removed as it is not at the edges, we pick the minimum and maximum of the unique index subset of n elements)
1   3         
6   12
 =6
* second step we remove the values we already used, so the array become like below:

1   2   3
7   15  16
*   *   * 
7 - 16
= 9

注意:
另一种消耗更多内存的方法包括创建 N 个子数组,我们将从中比较最大值 - 最小值

因此,从下面的排序值数组及其相应的索引数组中,我们提取其他三个子数组:

1   3   2   3   1   2   1   2   3
6   10  11  15  16  17  67  68  100

第一个数组:< /em>

1   3   2 
6   10  11

11-6 = 5

第二个数组:

3   1   2
15  15  17

17-15 = 2

第三个​​数组:

1   2   3
67  68  100

100 - 67 = 33

Here is my logic on how to resolve this issue, keeping in mind that we need to pick one element from each of the N arrays (to compute the least minimum)

// if we take the above values as an example!
// then the idea would be to sort all three arrays while keeping another
// array to keep the reference to their sets (1 or 2 or 3, could be 
// extended to n sets)      
1   3   2   3   1   2   1   2   3    // this is the array that holds the set index
6   10  11  15  16  17  67  68  100  // this is the sorted combined array.
           |           |   
    5            2          33       // this is the computed least minimum,
                                     // the rule is to make sure the indexes of the values 
                                     // we are comparing are different (to make sure we are 
// comparing elements from different sets), then for example
// the first element of that example is index:1|value:6 we hold 
// that value 6 (that is the value we will be using to compute the least minimum, 
// then we go to the edge of the comparison which would be the second different index, 
// we skip index:3|value:10 (we remove it from the array) we compare index:2|value:11 
// to index:1|value:6 we obtain 5 which would go to a variable named leastMinimum = 5, 
// now we remove the indexes and values we already used,
// and redo the same steps.

Step 1:

1   3   2   3   1   2   1   2   3
6   10  11  15  16  17  67  68  100
           |   
5            
leastMinumum = 5

Step 2:

3   1   2   1   2   3
15  16  17  67  68  100
           |   
 2          
leastMinimum = min(2, leastMinumum) // which is equal 2

Step 3:

1   2   3
67  68  100

    33
leastMinimum = min(33, leastMinumum) // which is equal to old leastMinumum which is 2

Now: We suppose we have elements from the same array that are very close to each other (k=2 this time which means we only have 3 sets with two values) :

// After sorting the n arrays we will have the below indexes array and values array
1   1   2   3   2   3
6   7   8   12  15  16
*       *   *

* we skip second index of 1|7 and we take the least minimum of 1|6 and 3|12 (index:2|value:8 will be removed as it is not at the edges, we pick the minimum and maximum of the unique index subset of n elements)
1   3         
6   12
 =6
* second step we remove the values we already used, so the array become like below:

1   2   3
7   15  16
*   *   * 
7 - 16
= 9

Note:
Another approach that consumes more memory would consist of creating N sub-arrays from which we would be comparing the maximum - minumum

So from the below sorted values array and its corresponding indexes array we extract three other sub arrays:

1   3   2   3   1   2   1   2   3
6   10  11  15  16  17  67  68  100

First Array:

1   3   2 
6   10  11

11-6 = 5

Second Array:

3   1   2
15  15  17

17-15 = 2

Third Array:

1   2   3
67  68  100

100 - 67 = 33

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文