查找已排序行矩阵中的第 K 个最小元素
这是一个面试问题。
在排序的行矩阵中找到最小的元素,但不能排序的列,并且行之间没有关系。 (第一行和第n行之间没有关系 - 所知的只是每个行都按顺序排列)
输入是这样的:
[[1,50,60],
[20,30,40],
[2,3,4]]
k = 5
在这种情况下,输出是
20
因为20是此矩阵中的第五小元素。
我首先想到将所有元素添加到Minheap中,然后将元素进行轮询,同时从K中减去一个元素,直到我们得到答案。我还考虑过针对O(M*N)解决方案实现QuickSelect,尽管这些解决方案并没有真正利用所有行按升序排序的事实。
解决此问题的最佳方法是什么?考虑到这一点后,我意识到这似乎是一个“合并k分类列表”的问题,在找到最小的元素后,我们在哪里停下来。
谢谢
This is an interview question.
Find the Kth smallest element in a matrix of sorted rows, but NOT sorted columns and no relations between the rows. (the first row and nth row have no relation between them - all that is known is that each row is in ascending order)
An example input is this:
[[1,50,60],
[20,30,40],
[2,3,4]]
k = 5
And the output in this scenario would be
20
because 20 is the 5th smallest element in this matrix.
I first thought of adding all the elements to a minHeap, then polling the elements while subtracting one from k each iteration until we have our answer. I also thought about implementing quickselect for a O(m*n) solution, although these solutions dont really take advantage of the fact that all the rows are sorted in ascending order.
What is the optimal way to solve this problem? After thinking about it, I realize it seems like a 'merge k sorted lists' question, where we stop after we find the kth smallest element.
Thanks
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
data:image/s3,"s3://crabby-images/d5906/d59060df4059a6cc364216c4d63ceec29ef7fe66" alt="扫码二维码加入Web技术交流群"
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(6)
假设以下情况,
由于每行具有相同的开始/结束元素,并且任何两行之间没有关系,因此如果不使用一些强力强制,很难确定任何有用的信息。最简单的解决方案应该是
这比完整的暴力破解有一些微小的改进,但仍然有
O(m*n)< /code> 时间复杂度和
O(k)
空间复杂度。编辑:
您可以在
O(mlog(m))
时间内根据矩阵行的第一个元素对矩阵行进行排序(使用最后一个元素打破平局),以减少堆压入/弹出操作的数量。 (对于单调递减矩阵,您将在每种情况下推送/弹出堆元素,这会降低性能。)Assume the following case
Since each row has the same starting/ending element and there's no relation between any two rows, it's hard to determine any useful information without using some brute forcing. The simplest solution should be
k
elements of the matrixmax_heap.top
, pop the heap and add the current value to it. Else, stop the iteration for the current row (since we can't have thekth
smallest element here)This has minor improvements over complete bruteforcing, but still has
O(m*n)
time andO(k)
space complexity.Edit:
You can sort the matrix rows based on their first element (break ties using the last element) in
O(mlog(m))
time to reduce the number of heap push/pop operations. (For a monotonically decreasing matrix, you'll be pushing/popping a heap element at each case, which would degrade performance.)一种确实利用排序行的方法是嵌套二分搜索,当值的范围不太大时,这种方法速度更快。如果您的矩阵有
m
行和n
列,并且某些Range
等于max(Matrix)-min(Matrix)+1
,时间复杂度可以写成O(m * (log n) * (log Range))
算法背后的思想就是
kth
的定义> 最小元素为k-1
的值矩阵的元素小于该值。”在矩阵中,这变成
k-1
”。在单行中,我们可以使用二分查找在
log n
时间内计算有多少元素小于某个x
。对于整个矩阵,这需要m * log n
时间。最初,您应该扫描矩阵的第一列和最后一列,以分别查找矩阵中外部二分搜索边界的最小值和最大值。这是算法其余部分的伪代码:
low
high
low
high 高:- 让
- 使用二分查找计算矩阵中小于
- 如果此计数小于
- 否则:设置
中=(低+高)/2
mid
的元素数量k-1
:设置low = mid + 1
高=中
low
代码>One approach that does take advantage of the sorted rows is nested binary search, which is faster when the range of values isn't too large. If your matrix has
m
rows andn
columns, and someRange
equal tomax(Matrix)-min(Matrix)+1
, the time complexity can be written asO(m * (log n) * (log Range))
The idea behind the algorithm is just the definition of
kth
smallest element ask-1
elements of the matrix are smaller than the value."In a matrix, this becomes
k-1
".In a single row, we can count how many elements are smaller than some
x
using binary search inlog n
time. For the whole matrix, this takesm * log n
time. Initially, you should scan the first and last columns of the matrix to find the smallest and largest values in the matrix, respectively, for your outer binary search bounds.Here's the pseudocode for the rest of the algorithm:
low
high
low < high
:mid = (low + high)/2
mid
in the matrix, using binary searchk-1
: Setlow = mid + 1
high = mid
low
解决方案的关键是所有行都是分类的。我将建议以下建议:
给定一个排序的矩阵
a
:0
TH的TH索引值,其相应的行号为min_heap
堆中的位置由值定义,并且行号用于时间优化。index_vec = zeros([1,n])
是lngth等于向量中所有值的行数0
,因为起始索引为0
(n
是行的数量)min_heap
的顶部,然后推入min_value_tuple
index_vec [min_value_tuple [1]] += 1
a [chosen_row,index_vec [min_value_tuple [1]]]
进入min堆,作为一个带有相应行号的元重。min_heap
的顶部note 如果在步骤3中的循环中。到6。
index_vec [ii] = m
,其中m是柱状的数量,我们不会将新数字推到min_heap
,而只是将堆构成以适应新的根。时间复杂性
总运行时为o(n + klong(n))。
最坏的情况是
k = m*n/2
。在这种情况下,时间复杂性将为O(n + nmlog(n))= o(nmlog(n))。空间
堆
带大小
n
的需要大小
n
和index_vec &gt; n*m/2
如果在翻转a
的行后,矩阵中最大的数字找到了矩阵中的最大数字,则可以解决等效问题。The key to the solution is that all the rows are sorted. I will suggest the following:
Given a row-ordered matrix
A
:0
th index value of each row and their corresponding row numbers asmin_heap
where the location in the heap is defined by the value and the row number is for time optimization.index_vec = zeros([1, n])
to be the lngth equal to the number of rows where all the values in the vector are0
since the starting index is0
(n
is the number of rows)min_heap
and push intomin_value_tuple
index_vec[min_value_tuple[1]] += 1
A[chosen_row, index_vec[min_value_tuple[1]]]
into the min heap as a tuple with the corresponding row number.min_heap
and return itNote if during the loop in steps 3. through 6.
index_vec[ii] = m
where m is the number of colums, we will not push a new number tomin_heap
and just orgenize the heap to accomodate a new root.Time complexity
n
numbers in the heapTotal runtime will be O(n + klong(n)).
Worst case is for
k=m*n/2
. In that case the time complexity will be O(n + nmlog(n)) = O(nmlog(n)).Space
This requires a heap with size
n
andindex_vec
with sizen
as well so total space is O(n)Note
If
k > n*m/2
you can solve an equivalent problem if finding then*m-k
largest number in the matrix after flipping the rows ofA
.按行对矩阵进行排序,然后在行元素
k
times上迭代。Sort the matrix by row, followed by iterating over the row element
k
times.让我们定义一个存储一对值的最小堆,那么该算法由以下步骤组成:
1-迭代矩阵第一列的值,并将每个值作为一对 (value, row指数)。
2-重复以下步骤k-1次:将堆顶对的行索引存储在idx中,然后删除该对。现在转到索引等于 idx 的行,并以与之前相同的方式将其中下一个未使用的元素添加到堆中。经过 k-1 次迭代后,顶部元素将包含第 K 个元素。
第一步的时间复杂度是 O(n.log n),其中 n 是矩阵的行数。第二步的时间复杂度是 O(k.log n),因为我们将从堆中删除/添加 k-1 个元素,并且在每个时刻堆包含不超过 n 个元素。所以总的时间复杂度是O((n+k)log n)。
Lets define a min-heap that stores a pair of values, then the algorithm consists of the following steps:
1- iterate through the values of the first column of the matrix and add each value to the heap as a pair of (value, row index).
2- repeat the following step exactly k-1 times: store the row index of the top pair of the heap in idx, then delete this pair. Now go to the row with index equal to idx and add the next unused element in it to the heap in the same way as before. After k-1 iterations the top element would contain the K-th element.
The time complexity of the first step is O(n.log n) such that n is the number of rows of the matrix. The time comexity of the second step is O(k.log n) because we will delet/add k-1 elements from/to the heap and at each moment the heap contains no more than n elements. So the total time complexity is O((n+k)log n).
对每行的前 k 个元素运行快速选择 O(kn) 时间。
Run quickselect on the first k elements of each row for O(kn) time.