Groupby, then convert specific rows to columns in the same DataFrame

Posted on 2025-01-24 10:48:53

First, I have this DataFrame:

   ID  Age name time
0   1   12    r    y
1   1   13    c    y
2   1   14    n    y
3   1   15    m    y
4   2   11    l    N
5   2   22    k    N
6   2   33    r    N
7   2   55    l    N
  • First, I want to group by ID (so I will have groups 1 and 2).

  • Then, from the [Age] column of the frame grouped by ID, I only need the first 2 rows and the last row of each group. So for group 1, I need the first row (12), the second row (13) and the last row (15) of the [Age] column. I need the same for group 2 as well.

  • For the remaining columns, [name] and [time], I only need the last row of each group. So for group 1, I need the last row of the [name] column, which is m, and the last row of the [time] column, which is y.

  • In the end, I will have only one row for each ID.

This is my expected/desired output:

   ID  Age 1  Age 2  Age 3 name time
0   1     12     13     15    m    y
1   2     11     22     55    l    N
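For anyone who wants to try the answers, the input table can be rebuilt along these lines. This is only a sketch: the values are copied from the question, the variable name `df` is assumed because the answers use it, and the dtypes (integers for ID and Age, strings for name and time) are a guess based on how the table is printed.

```python
import pandas as pd

# Input frame copied from the question's table.
# Assumption: ID and Age are integers, name and time are strings.
df = pd.DataFrame({
    "ID":   [1, 1, 1, 1, 2, 2, 2, 2],
    "Age":  [12, 13, 14, 15, 11, 22, 33, 55],
    "name": ["r", "c", "n", "m", "l", "k", "r", "l"],
    "time": ["y", "y", "y", "y", "N", "N", "N", "N"],
})

print(df)
```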


Comments (2)

枉心 2025-01-31 10:48:54
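import numpy as np
import pandas as pd

# For each ID, collect the first two Age values and the last one into a list
df1 = df.groupby('ID').agg({'Age': lambda x: list(np.r_[x.head(2), x.tail(1)])})

# Take the last name and time of each group
df1[['name', 'time']] = df.groupby('ID')[['name', 'time']].last()

# Spread the three-element lists into separate columns
df1[['Age1', 'Age2', 'Age3']] = pd.DataFrame(df1['Age'].to_list(), index=df1.index)

# Drop the helper list column and turn ID back into a regular column
df1.drop('Age', axis=1).reset_index()

   ID name time  Age1  Age2  Age3
0   1    m    y    12    13    15
1   2    l    N    11    22    55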
滥情空心 2025-01-31 10:48:53

Try with groupby and pivot:

# keep only the needed data: first two and last Age, last name and time
grouped = df.groupby("ID", as_index=False).agg({"Age": lambda x: x.tolist()[:2] + [x.iat[-1]], "name": "last", "time": "last"}).explode("Age")

# number the Age values within each ID (1, 2, 3)
grouped["idx"] = grouped.groupby("ID").cumcount().add(1)

# pivot to get the required structure (pivot takes keyword arguments only in pandas 2.0+)
output = grouped.pivot(index=["ID", "name", "time"], columns="idx", values="Age").add_prefix("Age").reset_index().rename_axis(None, axis=1)

>>> output
   ID name time Age1 Age2 Age3
0   1    m    y   12   13   15
1   2    l    N   11   22   55
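For comparison, the same reshaping can probably be done in a single pass with named aggregation. This is an alternative sketch, not the method from this answer. It assumes every group has at least two rows (otherwise `s.iloc[1]` raises an `IndexError`) and that the columns contain no missing values, because the `"first"` and `"last"` aggregations skip NaN while `head` and `tail` do not. The sample frame and the name `out` are illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID":   [1, 1, 1, 1, 2, 2, 2, 2],
    "Age":  [12, 13, 14, 15, 11, 22, 33, 55],
    "name": ["r", "c", "n", "m", "l", "k", "r", "l"],
    "time": ["y", "y", "y", "y", "N", "N", "N", "N"],
})

# Named aggregation: each keyword becomes one output column.
out = (
    df.groupby("ID")
      .agg(
          Age1=("Age", "first"),               # first Age in the group
          Age2=("Age", lambda s: s.iloc[1]),   # second Age; assumes >= 2 rows per group
          Age3=("Age", "last"),                # last Age in the group
          name=("name", "last"),
          time=("time", "last"),
      )
      .reset_index()
)

print(out)
```

Compared with the pivot approach, this keeps the Age columns numeric rather than object dtype, at the cost of hard-coding which positions are extracted.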