Subsetting a multi-index DataFrame by the highest value of each second-level index in Python

Posted on 2025-02-12 17:06:06

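I have a multi-index DataFrame that looks like the image below. I have sorted it so that, within each first-level index, the second-level index with the higher Perc_viewed value comes first. So for each first-level index, whichever of male (M) or female (F) has the higher value is listed first. Now I want to access only the first-level index value and the first (top) second-level index value, just those two index values, and use them in my write-up as the "top" in each category. How can I do this, please?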

[image omitted]

Another view of this data is shown below. In this view, the values I have highlighted are the only ones I want to access.


机场等船 2025-02-19 17:06:06

Reproducing the data from your post:

import numpy as np
import pandas as pd

# Two-level index: category (level 0) and gender, F or M (level 1)
arrays = [
    np.array(["informational_0_0_4", "informational_0_0_4", "informational_0_0_3", "informational_0_0_3", "discount_5_20_10", "discount_5_20_10", "discount_3_7_7", "discount_3_7_7"]),
    np.array(["F", "M", "M", "F", "F", "M", "F", "M"]),
]

data = {'event':[2797, 3860, 3805, 2838, 2849, 3877, 2773, 3882],'viewed_count':[1551, 1936, 3433, 2440, 1010, 1205, 2669, 3710]}

df = pd.DataFrame(data, index=arrays)

# Percentage of events that were viewed, rounded to two decimals
df['perc_viewed'] = (df['viewed_count']*100/df['event']).round(2)
df

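Here is how to fetch the row with the highest value within each level=0 group. It relies on df already being sorted in descending order within each group, so the first row of each group is the maximum.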

df.groupby(level=0).head(1)
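
Since the question asks only for the two index values of each top row, here is a minimal sketch of how they could be pulled out of that result. The names `top_pairs` and `top_pairs_unsorted` are made up for illustration, and the second variant, based on `idxmax`, is an alternative I am assuming would be useful when the rows have not been pre-sorted; it is not part of the original answer.

```python
import numpy as np
import pandas as pd

# Same data as in the answer above
arrays = [
    np.array(["informational_0_0_4", "informational_0_0_4", "informational_0_0_3", "informational_0_0_3",
              "discount_5_20_10", "discount_5_20_10", "discount_3_7_7", "discount_3_7_7"]),
    np.array(["F", "M", "M", "F", "F", "M", "F", "M"]),
]
data = {
    "event": [2797, 3860, 3805, 2838, 2849, 3877, 2773, 3882],
    "viewed_count": [1551, 1936, 3433, 2440, 1010, 1205, 2669, 3710],
}
df = pd.DataFrame(data, index=arrays)
df["perc_viewed"] = (df["viewed_count"] * 100 / df["event"]).round(2)

# Variant 1: rows are already sorted within each group, so take the first row
# per level-0 group and read its index as (level 0, level 1) tuples.
top_pairs = df.groupby(level=0).head(1).index.tolist()

# Variant 2 (assumption: rows may NOT be pre-sorted): ask each group for the
# index label of its largest perc_viewed; on a MultiIndex the labels are tuples.
top_pairs_unsorted = (
    df.groupby(level=0, sort=False)["perc_viewed"].idxmax().tolist()
)

# Use the pairs in a write-up
for category, gender in top_pairs:
    print(f"Top viewer group for {category}: {gender}")
```

Each entry in `top_pairs` is a `(category, gender)` tuple, so the two index values can be unpacked directly without touching the data columns.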
