How can one efficiently find the rank of each element of an array, averaging the ranks in the case of ties? For example:
float[] rank(T)(T[] input) {
// Implementation
}
auto foo = rank([3,6,4,2,2]); // foo == [3, 5, 4, 1.5, 1.5]
The only way I can think of doing it requires allocating 3 arrays:
- A duplicate of the input array because it has to be sorted and we don't own it.
- An array to keep track of the order in which the input array was sorted.
- An array of ranks to return.
Does anyone know how to do this in O(N log N) time and O(1) auxiliary space (meaning the only array we have to allocate is the one we're going to return), or at least get rid of one of the three arrays above?
You can allocate the array you are going to return (let's call it R) and initialize it to 0..n-1. Then "sort" the incoming array (called I) indirectly: compare I[R[k]] with I[R[j]] instead of the normal R[k] with R[j], and swap the values in the R array as needed (instead of the values in the I array as usual).
You can implement this indirect sorting using either quicksort or heapsort (or bubblesort but that will mess up your complexity).
You only need to allocate one array - and some stack space for the indices.
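A minimal Python sketch of this indirect sort, extended to average tied ranks as the question asks. The function name `rank_indirect` and the separate `ranks` list are assumptions made for clarity: this version allocates two lists rather than one, and Python's built-in sort is not an O(1)-space quicksort or heapsort, so it illustrates the comparison through the index array rather than the space bound.

```python
def rank_indirect(values):
    """Return 1-based ranks of values, averaging ranks over ties."""
    n = len(values)
    # R from the answer: a list of indices, sorted by the values they point to.
    order = list(range(n))
    order.sort(key=lambda i: values[i])  # compares values[order[k]], never moves values

    ranks = [0.0] * n
    start = 0
    while start < n:
        # Extend the run [start, end] over all entries equal to values[order[start]].
        end = start
        while end + 1 < n and values[order[end + 1]] == values[order[start]]:
            end += 1
        # Average of the 1-based positions start+1 .. end+1.
        average = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = average
        start = end + 1
    return ranks


print(rank_indirect([3, 6, 4, 2, 2]))  # [3.0, 5.0, 4.0, 1.5, 1.5]
```

After the sort, `order` holds the sorting permutation, not the ranks themselves, which is why a second pass is needed to write each element's rank back to its original position.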
Ok, so you duplicate your input array into foo. Sort foo in place in O(n log n) time with heapsort. Now, take each element of your input array in turn, find its rank in foo in O(log n) time using binary search, insert the rank into the ranks array, and return that array when you are done. Now you use 2 arrays instead of 3.
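A rough Python sketch of the copy, sort, and binary-search idea. The name `rank_by_binary_search` is hypothetical, `sorted` stands in for an in-place heapsort on the copy, and the tie averaging (searching for both ends of a run of equal values) is an addition to cover the question's requirement rather than something the answer spells out.

```python
from bisect import bisect_left, bisect_right


def rank_by_binary_search(values):
    """Return 1-based ranks of values, averaging ranks over ties."""
    foo = sorted(values)  # sorted copy; the input itself is left untouched
    ranks = []
    for v in values:
        first = bisect_left(foo, v)   # 0-based index of the first copy of v
        last = bisect_right(foo, v)   # 0-based index one past the last copy of v
        # 1-based positions of v run from first+1 to last; take their average.
        ranks.append((first + 1 + last) / 2)
    return ranks


print(rank_by_binary_search([3, 6, 4, 2, 2]))  # [3.0, 5.0, 4.0, 1.5, 1.5]
```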
If you don't own the array, I don't think it's possible to do it in O(N log N) time and O(1) space.
If the range of elements (how large an element can be) is small, use counting: count how many times each value occurs, then calculate the result array from the input array using the counting array.
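The counting idea could look roughly like this in Python, assuming the elements are integers known to lie in a small range. The function name `rank_by_counting` and the `lo`/`hi` bounds parameters are illustrative assumptions, not part of the original answer.

```python
def rank_by_counting(values, lo, hi):
    """Return 1-based averaged ranks for integers in the range lo..hi."""
    # counts[k] = how many times the value lo + k occurs.
    counts = [0] * (hi - lo + 1)
    for v in values:
        counts[v - lo] += 1

    # below[k] = how many elements are strictly smaller than lo + k.
    below = [0] * len(counts)
    running = 0
    for k, c in enumerate(counts):
        below[k] = running
        running += c

    # Copies of v occupy 1-based positions below+1 .. below+count;
    # the average of that run is below + (count + 1) / 2.
    return [below[v - lo] + (counts[v - lo] + 1) / 2 for v in values]


print(rank_by_counting([3, 6, 4, 2, 2], lo=2, hi=6))  # [3.0, 5.0, 4.0, 1.5, 1.5]
```

This runs in O(N + K) time for a range of K possible values, at the cost of O(K) extra space for the counting arrays.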
Why not just copy and sort the array and go from there? There are plenty of in-place sort algorithms available such as heapsort.
Perhaps it would be useful to summarize florin's answer (and the associated comments) with some simple code.
Here's how to do it in Ruby:
And in Python:
The ranks array tells you that 0 has rank 2, 1 has rank 1, 2 has rank 4, etc. (Of course, those ranks start at zero, not one.)
How about using a binary search tree and inserting the elements one by one into it? The rank of an element can then be determined by counting all the elements that appear to the left of its node, using an in-order traversal of the BST.
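One way the BST idea might be sketched in Python. The `Node` class, the per-node duplicate count, and the `bst_ranks` name are assumptions for illustration. The tree here is unbalanced, so the worst case (already sorted input) is O(N^2), and the tree plus the lookup dictionary use O(N) extra space, so this does not meet the space bound in the question.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.count = 1  # how many times key was inserted
        self.left = None
        self.right = None


def insert(root, key):
    """Insert key into the (unbalanced) BST and return the root."""
    if root is None:
        return Node(key)
    node = root
    while True:
        if key == node.key:
            node.count += 1
            return root
        side = "left" if key < node.key else "right"
        child = getattr(node, side)
        if child is None:
            setattr(node, side, Node(key))
            return root
        node = child


def bst_ranks(values):
    """Return 1-based ranks of values, averaging ranks over ties."""
    root = None
    for v in values:
        root = insert(root, v)

    # Iterative in-order traversal; seen = elements visited so far,
    # i.e. everything smaller than the current key.
    rank_of, seen = {}, 0
    stack, node = [], root
    while stack or node:
        while node:
            stack.append(node)
            node = node.left
        node = stack.pop()
        rank_of[node.key] = seen + (node.count + 1) / 2
        seen += node.count
        node = node.right
    return [rank_of[v] for v in values]


print(bst_ranks([3, 6, 4, 2, 2]))  # [3.0, 5.0, 4.0, 1.5, 1.5]
```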
I've used this to do it quick and dirty in Python:
The first example works when you don't have duplicates in your original list. It can be done better, but I was playing with some hacks and came up with this. The second one works if you have duplicates.