Subsetting a multilevel DataFrame by the highest value of each second-level index in Python

Posted on 2025-02-12 17:06:06

I have a multilevel-index DataFrame that looks like the image below. Within each first-level index value, I have sorted the second-level index by the perc_viewed column in descending order, so the higher of Male (M) or Female (F) is listed first. Now I want to access just the first-level index values and, for each one, the first (top) second-level index value. I need only those two values, so I can cite them in my analysis write-up as the "top" in each category. How can I do this?

enter image description here

Another view of this data is shown below. The highlighted values, which appear first in each group, are the only ones I want to access.

enter image description here

Comments (1)

机场等船 2025-02-19 17:06:06

Reproduce the data from your post:

import numpy as np
import pandas as pd

# Two-level index: category (level 0) and gender (level 1)
arrays = [
    np.array(["informational_0_0_4", "informational_0_0_4", "informational_0_0_3", "informational_0_0_3", "discount_5_20_10", "discount_5_20_10", "discount_3_7_7", "discount_3_7_7"]),
    np.array(["F", "M", "M", "F", "F", "M", "F", "M"]),
]

data = {'event': [2797, 3860, 3805, 2838, 2849, 3877, 2773, 3882], 'viewed_count': [1551, 1936, 3433, 2440, 1010, 1205, 2669, 3710]}

df = pd.DataFrame(data, index=arrays)

# Percentage of events that were viewed, rounded to two decimals
df['perc_viewed'] = (df['viewed_count'] * 100 / df['event']).round(2)
df

Here's an approach that takes the first row of each level=0 group. Because you have already sorted df in descending order of perc_viewed within each group, that first row is the one with the maximum value.

df.groupby(level=0).head(1)

enter image description here
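Since the question asks for only the two index values per category, here is a minimal sketch of how they could be pulled out of that result. It rebuilds the same sample data so it runs on its own. The variable names (top, top_pairs, top_sorted, top_by_category) are illustrative and not from the original post, and the second variant is an assumption about what you might want if the frame were not already sorted.

```python
import numpy as np
import pandas as pd

# Same sample data as above, rebuilt so this snippet is self-contained.
arrays = [
    np.array(["informational_0_0_4", "informational_0_0_4",
              "informational_0_0_3", "informational_0_0_3",
              "discount_5_20_10", "discount_5_20_10",
              "discount_3_7_7", "discount_3_7_7"]),
    np.array(["F", "M", "M", "F", "F", "M", "F", "M"]),
]
data = {
    "event": [2797, 3860, 3805, 2838, 2849, 3877, 2773, 3882],
    "viewed_count": [1551, 1936, 3433, 2440, 1010, 1205, 2669, 3710],
}
df = pd.DataFrame(data, index=arrays)
df["perc_viewed"] = (df["viewed_count"] * 100 / df["event"]).round(2)

# First row of each level-0 group. This relies on df already being
# sorted so that the highest perc_viewed comes first within each group.
top = df.groupby(level=0).head(1)

# Only the (first-level, second-level) index pairs, as a list of tuples.
top_pairs = top.index.tolist()

# Variant that does not rely on the existing row order (illustrative):
# sort by perc_viewed first, then take the first row of each group.
top_sorted = df.sort_values("perc_viewed", ascending=False).groupby(level=0).head(1)
top_by_category = dict(top_sorted.index.tolist())

# Print one line per category for the write-up.
for (category, gender), pct in top["perc_viewed"].items():
    print(f"{category}: {gender} ({pct}%)")
```

head(1) keeps the rows in their original order, so top_pairs follows the order of the categories in df. The dictionary form may be handier if you want to look up the top gender by category name.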
