Python pandas performance dataframe processing-efficiency

Most efficient way to find the 3 rows with the largest values in a column?

Posted on 2025-01-27 00:08:30

Let's say there is a DataFrame df:

Name  Balance
A     1000
B     5000
C     3000
D     6000
E     2000
F     5000

I am looking for a way to get the three rows with the highest balances.

df['Balance'].get_indices_max(n=3)  # hypothetical method; n is the number of results required

Expected output when these indices are used to select the rows:

D 6000
F 5000
B 5000

UPDATE: Extra notes regarding the accepted answer

Possible "keep" values:

first : prioritize the first occurrence(s)

last : prioritize the last occurrence(s)

all : do not drop any duplicates, even if it means selecting more than n items.
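
The three "keep" options can be compared side by side on a small example. This is a minimal sketch, assuming the table above with one extra row X (added here purely for illustration) so that three rows are tied at 5000:

```python
import pandas as pd

# Illustrative data: rows B, F and X are tied at 5000 (X is an assumed extra row).
df = pd.DataFrame({
    "Name": list("ABCDEFX"),
    "Balance": [1000, 5000, 3000, 6000, 2000, 5000, 5000],
})

balances = df["Balance"]

# keep='first' (the default): among tied values, earlier rows are preferred.
first_idx = balances.nlargest(3, keep="first").index
# keep='last': among tied values, later rows are preferred.
last_idx = balances.nlargest(3, keep="last").index
# keep='all': every row tied with the n-th value is kept, so more than n rows may come back.
all_idx = balances.nlargest(3, keep="all").index

print(df.loc[first_idx])
print(df.loc[last_idx])
print(df.loc[all_idx])
```

With this data, 'first' and 'last' each return three rows but disagree on which of the tied 5000 rows to include, while 'all' returns four rows.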

Comments (2)

春风十里 2025-02-03 00:08:30

Answer

import pandas as pd

df = pd.DataFrame({"Name": list("ABCDEF"), "Balance": [1000, 5000, 3000, 6000, 2000, 5000]})
index = df["Balance"].nlargest(3).index  # index labels of the 3 largest balances
df.loc[index]

Output

  Name  Balance
3    D     6000
1    B     5000
5    F     5000

Attention

Performance

The columns that are not specified are returned as well, but not used for ordering.
This method is equivalent to df.sort_values(columns, ascending=False).head(n), but more performant.
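
The equivalence and the performance claim can be checked informally. The following is a rough sketch, assuming synthetic random data (the size, seed and value range are arbitrary choices, not from the question); timings will vary with the machine, pandas version and data size:

```python
import timeit

import numpy as np
import pandas as pd

# Synthetic data; the size and value range are arbitrary assumptions for illustration.
rng = np.random.default_rng(0)
big = pd.DataFrame({"Balance": rng.integers(0, 1_000_000, size=100_000)})

n = 3
via_nlargest = big.nlargest(n, "Balance")
via_sort = big.sort_values("Balance", ascending=False).head(n)

# Both approaches select the same top balances (which tied rows are picked may differ).
same_values = via_nlargest["Balance"].tolist() == via_sort["Balance"].tolist()
print("same top values:", same_values)

# Rough timing of each approach over 20 repetitions.
t_nlargest = timeit.timeit(lambda: big.nlargest(n, "Balance"), number=20)
t_sort = timeit.timeit(
    lambda: big.sort_values("Balance", ascending=False).head(n), number=20
)
print(f"nlargest:    {t_nlargest:.4f} s")
print(f"sort + head: {t_sort:.4f} s")
```

The comparison is made on the Balance values rather than the index labels because the two approaches are not guaranteed to break ties in the same way.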

nlargest(3, keep='all')

keep : {'first', 'last', 'all'}, default 'first'
When using keep='all', all rows tied with the smallest selected value are kept, so more than n rows may be returned.

Example

import pandas as pd

df = pd.DataFrame({"Name": list("ABCDEFX"), "Balance": [1000, 5000, 3000, 6000, 2000, 5000, 5000]})
index = df["Balance"].nlargest(3, keep='all').index  # keeps every row tied at 5000
df.loc[index]

  Name  Balance
3    D     6000
1    B     5000
5    F     5000
6    X     5000

Reference

DataFrame.nlargest

阿楠 2025-02-03 00:08:30

I usually do

out = df.sort_values('Balance').iloc[-3:]  # last three rows of the ascending sort
out
  Name  Balance
1    B     5000
5    F     5000
3    D     6000
