O(n) algorithm to find the median of n² implied numbers
Problem: The input is a (not necessarily sorted) sequence S = k1, k2, ..., kn of n arbitrary numbers. Consider the collection C of the n² numbers of the form min{ki, kj}, for 1 <= i, j <= n. Present an O(n) time and O(n) space algorithm to find the median of C.
So far, by examining C for different sets S, I've found that the smallest number in S occurs (2n-1) times in C, the next smallest (2n-3) times, and so on, until the largest number occurs only once.
Is there a way to use this information to find the median of C?
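The observation above can be checked by brute force on a small input. This is only an illustrative sketch: the helper names `build_c` and `multiplicities` are made up for this example, the input is assumed to contain distinct values, and building C explicitly takes O(n²) time and space, so it is a sanity check rather than a solution.

```python
from collections import Counter
from itertools import product

def build_c(s):
    """Return the n^2 values min(ki, kj) over all ordered pairs (i, j)."""
    return [min(a, b) for a, b in product(s, repeat=2)]

def multiplicities(s):
    """Count how often each value of s appears in C, smallest value first.

    Assumes the values in s are distinct (hypothetical helper for this check).
    """
    counts = Counter(build_c(s))
    return [counts[v] for v in sorted(s)]

s = [7, 2, 9, 4]             # n = 4, distinct values
print(multiplicities(s))     # [7, 5, 3, 1], i.e. 2n-1, 2n-3, ..., 1
```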
Comments (3)
There are a number of possibilities. One I like is Hoare's Select algorithm. The basic idea is similar to Quicksort, except that when you recurse, you only recurse into the partition that holds the number(s) you're looking for. For example, if you want the median of 100 numbers, you'd start by partitioning the array, just like in Quicksort. You'd get two partitions, one of which contains the 50th element. Recursively carry out your selection in that partition. Continue until your partition contains only one element, which will be the median (and note that you can do the same for any other element of your choice).
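A minimal sketch of that idea, under some assumptions: Hoare's original partitions the array in place, whereas this version builds new lists with a three-way split and a random pivot for brevity, and it loops instead of recursing. The function name `quickselect` and the 0-based rank `k` are choices made for this example. It runs in linear time on average, not in the worst case.

```python
import random

def quickselect(items, k):
    """Return the k-th smallest element of items (k is 0-based, 0 <= k < len)."""
    items = list(items)          # work on a copy so the caller's data is untouched
    while True:
        if len(items) == 1:
            return items[0]
        pivot = random.choice(items)
        lower = [x for x in items if x < pivot]
        equal = [x for x in items if x == pivot]
        upper = [x for x in items if x > pivot]
        if k < len(lower):
            items = lower                      # answer lies among the smaller values
        elif k < len(lower) + len(equal):
            return pivot                       # answer is the pivot itself
        else:
            k -= len(lower) + len(equal)       # skip everything <= pivot
            items = upper

print(quickselect(range(1, 101), 49))   # 50, the 50th smallest of 1..100
```

Only one side of each partition is processed further, which is what brings the expected cost down from Quicksort's O(n log n) to O(n).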
Yes, good puzzle. We can find the median by building on the observation you made.
In C we have 1 occurrence of max(k), 3 occurrences of the next highest, 5 of the one after that, and so on.
If we sort the elements of C in descending order, the m highest numbers of S together occupy the first m^2 positions (the sum of the first m odd numbers is m^2).
The positions we are interested in (to calculate the median) are:
a. If n is odd: alpha = (n^2+1)/2.
b. If n is even: alpha1 = n^2/2 and alpha2 = n^2/2+1.
But alpha1 = n^2/2 is never a square number => position alpha1 is never the last position of a block of equal values (blocks end at square positions, since the sum of the first m odd numbers is a square) => positions alpha1 and alpha2 hold the same number, and that number is the median.
So it boils down to determining the smallest m such that m^2 (the sum of the first m odd numbers) is at least n^2/2, that is, m = ceiling(n/sqrt(2)), and then finding the mth highest number in the original sequence. (Whether to find the mth highest or the (n-m+1)th lowest is an optimization.)
We can find the mth highest number naively (keep track of the m largest numbers seen so far while scanning from the left, which is simple but not linear, since m is about 0.71n), or use the median of medians algorithm to do it in linear time.
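Putting the steps of this answer together, here is a hedged sketch. Two things in it are assumptions rather than part of the answer: the selection step uses a randomized quickselect (expected linear time) in place of median of medians (worst-case linear time), and m is computed with integer arithmetic via `math.isqrt` to avoid floating-point rounding in ceiling(n/sqrt(2)). The names `kth_largest` and `median_of_pairwise_min` are made up for this example, and the input is assumed to be non-empty.

```python
import random
from math import isqrt

def kth_largest(items, k):
    """Return the k-th largest element (k is 1-based) by randomized quickselect."""
    items = list(items)
    while True:
        pivot = random.choice(items)
        upper = [x for x in items if x > pivot]
        equal = [x for x in items if x == pivot]
        if k <= len(upper):
            items = upper                          # answer is among the larger values
        elif k <= len(upper) + len(equal):
            return pivot                           # answer is the pivot itself
        else:
            k -= len(upper) + len(equal)           # skip everything >= pivot
            items = [x for x in items if x < pivot]

def median_of_pairwise_min(s):
    """Median of C = {min(ki, kj)} without building C (assumes len(s) >= 1)."""
    n = len(s)
    target = (n * n + 1) // 2      # ceil(n^2 / 2)
    m = isqrt(target - 1) + 1      # smallest m with m*m >= target, i.e. ceil(n / sqrt(2))
    return kth_largest(s, m)

print(median_of_pairwise_min([7, 2, 9, 4]))   # 4
```

For S = 7, 2, 9, 4 we get n = 4 and m = 3, and the 3rd highest number is 4. Sorted, C is seven 2s, five 4s, three 7s and one 9, so the two middle positions (8 and 9) both hold 4.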
Wikipedia has a good article on selection algorithms. If you are using C++, the STL includes an nth_element() algorithm that runs in linear time on average.