Python: get the N largest elements from a list

Posted 2024-10-03 00:18:59 · 234 words · 3 views · 0 comments

Is there a function that returns the N highest elements of a list?

That is, if max(l) returns the single highest element, something like max(l, count=10) would return a list of the 10 highest numbers (or fewer if l is smaller).

Otherwise, what would be an efficient and easy way to get them? (Apart from the obvious canonical implementation, and without sorting the whole list first, which would be inefficient compared to the canonical solution.)

Comments (5)

池木 2024-10-10 00:18:59

heapq.nlargest:

>>> import heapq, random
>>> # range replaces the Python 2 xrange; the values are random, so the output differs on every run
>>> heapq.nlargest(3, (random.gauss(0, 1) for _ in range(100)))
[1.9730767232998481, 1.9326532289091407, 1.7762926716966254]
玩心态 2024-10-10 00:18:59

The function in the standard library that does this is heapq.nlargest.
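
As a minimal sketch of how this maps onto the question, the snippet below calls heapq.nlargest on a plain list, including the case where the list holds fewer than N elements. The top_n wrapper and the sample lists are illustrative names only and do not come from the question.

```python
import heapq

def top_n(values, n):
    """Return the n largest items of values, largest first."""
    # heapq.nlargest simply returns fewer items when values is shorter than n.
    return heapq.nlargest(n, values)

scores = [4, 9, 1, 7, 3, 8]
print(top_n(scores, 3))   # [9, 8, 7]
print(top_n([5, 2], 10))  # [5, 2]
```

heapq.nlargest also accepts a key argument, for example heapq.nlargest(2, words, key=len), for cases where the items should be ranked by something other than their own value.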

野侃 2024-10-10 00:18:59

Start with the first 10 elements of L and call that X. Note the minimum value of X.

Loop over L[i] for i ranging over the rest of L.

If L[i] is greater than min(X), drop min(X) from X and insert L[i]. You may need to keep X as a sorted linked list and insert in order. Update min(X).

At the end, X holds the 10 largest values.

I suspect this is O(kN) (where k is 10 here), since each insertion into the sorted list is linear in k. This might be what GSL uses, if you can read some C code:

http://www.gnu.org/software/gsl/manual/html_node/Selecting-the-k-smallest-or-largest-elements.html

NumPy probably has something that does this too.
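
The steps above can be sketched in Python roughly as follows. This is only an illustration under some assumptions: it keeps X as a sorted Python list maintained with bisect.insort rather than a linked list, and the function name n_largest_sorted_buffer is made up for this example.

```python
import bisect

def n_largest_sorted_buffer(items, n):
    """Keep the n largest items seen so far in an ascending sorted list."""
    if n <= 0:
        return []
    buffer = []  # ascending order, so buffer[0] is always the current minimum
    for item in items:
        if len(buffer) < n:
            # Still filling the initial buffer of n items.
            bisect.insort(buffer, item)
        elif item > buffer[0]:
            # Drop the current minimum, then insert the new item in order.
            buffer.pop(0)
            bisect.insort(buffer, item)
    return buffer[::-1]  # largest first

print(n_largest_sorted_buffer([5, 1, 9, 3, 7, 9, 2], 3))  # [9, 9, 7]
```

Each replacement costs time proportional to the buffer size, which matches the O(kN) estimate given above. Swapping the sorted list for a min-heap would reduce that per-item cost to O(log k).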

萌梦深 2024-10-10 00:18:59

A fairly efficient solution is a variation of quicksort (known as quickselect) in which you recurse only into the side of the pivot that still matters, rather than sorting both sides, and stop once the pivot position separates off the required number of largest elements (with a few extra conditions to handle edge cases, of course).

The standard library has heapq.nlargest, as others here have pointed out.
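
A rough sketch of that partitioning idea is shown below. It is not taken from the answer: the function name n_largest_quickselect is hypothetical, and for readability it builds new lists on each pass instead of partitioning in place, which a tuned implementation would likely avoid.

```python
import random

def n_largest_quickselect(items, n):
    """Return the n largest items, largest first, by repeated partitioning."""
    data = list(items)
    result = []
    while data and n > 0:
        pivot = random.choice(data)
        greater = [x for x in data if x > pivot]
        equal = [x for x in data if x == pivot]
        if len(greater) >= n:
            # All wanted items lie strictly above the pivot; narrow the search.
            data = greater
        elif len(greater) + len(equal) >= n:
            # The pivot value completes the selection.
            result.extend(greater)
            result.extend(equal[:n - len(greater)])
            n = 0
        else:
            # Everything at or above the pivot is wanted; continue below it.
            result.extend(greater)
            result.extend(equal)
            n -= len(greater) + len(equal)
            data = [x for x in data if x < pivot]
    # Only the selected items are sorted, not the whole input.
    return sorted(result, reverse=True)

print(n_largest_quickselect([5, 1, 9, 3, 7, 9, 2], 3))  # [9, 9, 7]
```

The expected running time is linear in the length of the input plus the cost of sorting the N selected items at the end. The random pivot means the worst case is still quadratic, though unlikely.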

晨光如昨 2024-10-10 00:18:59

If you do not mind using pandas, then:

import pandas as pd
N = 10
column_name = 0  # a DataFrame built from a flat list has a single column labelled 0
pd.DataFrame(your_array).nlargest(N, column_name)  # your_array is your input data

The code above returns the N largest values along with the index position of each value.

Pandas nlargest documentation
