Efficiently reducing group sizes in a dataframe

Posted on 2025-02-11 13:13:46

I have a dataframe that I group by the name in each row using the groupby function. I then want to reduce each group to a given size and add the reduced groups to a database for use in other processes. Currently I do this in a for loop, which seems really inefficient. Does pandas have a method to do this more efficiently?

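import pandas as pd

target_number_rows = 10  # desired maximum number of rows per group

grouped = df.groupby(['NAME'])

total = grouped.ngroups

df_final = pd.DataFrame()
for name, group in grouped:

    # Only groups larger than the target are thinned and kept
    if len(group.index) > target_number_rows:
        # Keep every k-th row, where k = group size // target
        shortened = group[::len(group.index) // target_number_rows]
        df_final = pd.concat([df_final, shortened], ignore_index=True)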

懷念過去 2025-02-18 13:13:46

Group by the name and apply a sample (which takes N random rows from within that group), where N is either your desired amount or the full size of that group, whichever is smaller, e.g.:

out = df.groupby('NAME').apply(lambda g: g.sample(min(len(g), target_number_rows)))
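A related way to get the same kind of random subset without calling a lambda per group is to shuffle the whole frame once and then keep the first N rows of each group. The sketch below is only an illustration: the sample data, the `VALUE` column, the cap of 2, and the fixed `random_state` are assumptions, not part of the question.

```python
import pandas as pd

target_number_rows = 2  # hypothetical cap, chosen small for the demo

# Hypothetical data: group "a" has 4 rows, "b" has 1, "c" has 3
df = pd.DataFrame({
    "NAME": ["a", "a", "a", "a", "b", "c", "c", "c"],
    "VALUE": range(8),
})

# Shuffle every row once, then keep the first N rows seen for each NAME.
# Groups with fewer than N rows are kept whole.
shuffled = df.sample(frac=1, random_state=0)
out = shuffled.groupby("NAME").head(target_number_rows)

# Optional: put the surviving rows back in their original order
out = out.sort_index()

# Rows kept per group
counts = out.groupby("NAME").size()
```

Because `head` simply stops at the end of a short group, no `min(len(g), ...)` guard is needed here.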

Otherwise, take the first N or last N rows of each group, e.g.:

out = df.groupby('NAME').head(target_number_rows)
# or...
out = df.groupby('NAME').tail(target_number_rows)
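If the evenly spaced selection from the question's loop matters (every k-th row rather than the first, last, or random rows), it can probably be reproduced without a loop by numbering the rows inside each group. The following is a minimal sketch under assumed data; the `VALUE` column and the cap of 3 are made up for the demo. Unlike the original loop, it keeps groups that are already at or below the target instead of dropping them.

```python
import pandas as pd

target_number_rows = 3  # hypothetical cap, chosen small for the demo

# Hypothetical data: group "a" has 7 rows, group "b" has 2
df = pd.DataFrame({
    "NAME": ["a"] * 7 + ["b"] * 2,
    "VALUE": range(9),
})

grouped = df.groupby("NAME")

# Position of each row within its own group: 0, 1, 2, ...
position = grouped.cumcount()

# Size of the group each row belongs to, aligned with df's index
group_size = grouped["NAME"].transform("size")

# Same stride as the loop (size // target), but never below 1 so that
# small groups are kept whole instead of producing a zero step
step = (group_size // target_number_rows).clip(lower=1)

# Keep every step-th row of each group
strided = df[position % step == 0]

# Integer division can leave slightly more than the target (here 4 rows
# for "a"), so optionally cap each group at exactly N rows
capped = strided.groupby("NAME").head(target_number_rows)
```

Here `strided` matches what `group[::step]` would select for each group, and the final `head` call is only needed when the row count must never exceed the target.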

About the author

雨的味道风的声音
