Python: take max N elements from some list
Is there a function that returns the N highest elements of a list?
That is, if max(l) returns the single highest element, something like max(l, count=10) would return a list of the 10 highest numbers (or fewer if l is shorter).
Or what would be an efficient, easy way to get them? (Other than the obvious canonical implementation; and nothing that involves sorting the whole list first, because that would be inefficient compared to the canonical solution.)
The function in the standard library that does this is heapq.nlargest.
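A minimal usage sketch of heapq.nlargest follows. The list `numbers` and the variable names are made up for illustration; the call takes the count first and the iterable second.

```python
import heapq

# Hypothetical sample data, for illustration only.
numbers = [5, 1, 8, 3, 9, 2, 7]

# The three largest values, returned in descending order.
top3 = heapq.nlargest(3, numbers)

# Asking for more items than the list holds returns everything available.
top10 = heapq.nlargest(10, numbers)

print(top3)
print(top10)
```

This matches the behaviour asked for in the question: a list of up to N values, or fewer when the input is shorter.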
Start with the first 10 elements of L and call that X. Note the minimum value of X.
Loop over L[i] for i over the rest of L.
If L[i] is greater than min(X), drop min(X) from X and insert L[i]. You may need to keep X as a sorted linked list and insert in order. Update min(X).
At the end, X holds the 10 largest values.
I suspect this is O(kN) (where k is 10 here), since each insertion into the sorted list takes time linear in k. This might be what GSL uses, if you can read some C code:
http://www.gnu.org/software/gsl/manual/html_node/Selecting-the-k-smallest-or-largest-elements.html
There is probably something in numpy that does this too.
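The steps above can be sketched roughly as follows. This is an illustration under assumptions, not the GSL or numpy implementation: the function name `n_largest_running` is hypothetical, and a plain Python list kept in ascending order with bisect.insort stands in for the sorted linked list.

```python
import bisect
from itertools import islice


def n_largest_running(values, n):
    """Return the n largest items of values, largest first."""
    if n <= 0:
        return []
    it = iter(values)
    # X starts as the first n items (fewer if the input is shorter),
    # kept in ascending order so x[0] is always min(X).
    x = sorted(islice(it, n))
    for v in it:
        if v > x[0]:
            del x[0]             # drop the current minimum
            bisect.insort(x, v)  # insert the new value in sorted position
    return x[::-1]


print(n_largest_running([5, 1, 8, 3, 9, 2, 7], 3))
```

Both the deletion and the ordered insertion are linear in the size of X, which is where the O(kN) estimate comes from.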
A fairly efficient solution is a variation of quicksort in which recursion is limited to the right part of the pivot until the pivot's position is higher than the number of elements required (with a few extra conditions to handle border cases, of course).
The standard library has heapq.nlargest, as others here have pointed out.
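One way to read the quicksort variation is as a quickselect-style partition loop. The sketch below is an assumption about what that could look like, not the answerer's code: `n_largest_quickselect` is a hypothetical name, it builds new lists instead of partitioning in place, and it uses a loop rather than recursion. The result is not sorted.

```python
import random


def n_largest_quickselect(values, n):
    """Return the n largest items of values, in no particular order."""
    items = list(values)
    result = []
    need = n  # how many items are still missing from result
    while need > 0 and items:
        if need >= len(items):
            # Every remaining candidate is wanted.
            result.extend(items)
            break
        pivot = random.choice(items)
        larger = [v for v in items if v > pivot]
        equal = [v for v in items if v == pivot]
        if len(larger) >= need:
            # All wanted items lie strictly above the pivot: narrow the search.
            items = larger
        elif len(larger) + len(equal) >= need:
            # Copies of the pivot value fill the remaining slots.
            result.extend(larger)
            result.extend(equal[:need - len(larger)])
            break
        else:
            # Everything at or above the pivot is wanted; continue below it.
            result.extend(larger)
            result.extend(equal)
            need -= len(larger) + len(equal)
            items = [v for v in items if v < pivot]
    return result


print(sorted(n_largest_quickselect([5, 1, 8, 3, 9, 2, 7, 8], 3), reverse=True))
```

Only one side of each partition is examined further, so the expected running time is linear in the length of the list, though an unlucky run of pivots can make it quadratic.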
If you do not mind using pandas, its nlargest method will show you the N largest values along with the index position of each value.
See the pandas nlargest documentation.
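The answer's own snippet is not included in the text here, so the following is only a minimal sketch of the idea, assuming a pandas Series built from a plain list. The sample data and variable names are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
values = [4, 1, 9, 7, 3, 8]
s = pd.Series(values)

# nlargest returns a Series of the N largest values, largest first;
# its index holds each value's position in the original list.
top3 = s.nlargest(3)

print(top3)
```

Calling `top3.index.tolist()` gives the positions and `top3.tolist()` gives the values, if plain lists are more convenient.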