Pandas: How to use groupby to get the count of each value in a column
This is the dataframe I've got:
import pandas as pd

data = {'Year': [2021, 2021, 2021, 2022, 2022, 2022],
        'Class': ['A', 'A', 'B', 'A', 'C', 'C'],
        'Animal': ['dog|cat|bird', 'cat|dog', 'tiger|dog', 'cat|bird', 'dog|cat|rabbit', 'rabbit|dog|tiger']}
df = pd.DataFrame(data)
So the df looks like:
Year | Class | Animal |
---|---|---|
2021 | A | dog\|cat\|bird |
2021 | A | cat\|dog |
2021 | B | tiger\|dog |
2022 | A | cat\|bird |
2022 | C | dog\|cat\|rabbit |
2022 | C | rabbit\|dog\|tiger |
What I'd like to do is to calculate the number of each animal in each year and class. For example, I want to get the following dataframe:
Year | Class | Animal | Count |
---|---|---|---|
2021 | A | dog | 2 |
2021 | A | cat | 2 |
2021 | A | bird | 1 |
2021 | B | tiger | 1 |
2021 | B | dog | 1 |
2022 | A | cat | 1 |
2022 | A | bird | 1 |
2022 | C | dog | 2 |
2022 | C | cat | 1 |
2022 | C | rabbit | 2 |
2022 | C | tiger | 1 |
Does anyone have any suggestions for achieving this? I'd really appreciate it.
神爱温柔 2025-01-19 19:50:01
Let's try str.get_dummies, then groupby:
out = (df.Animal.str.get_dummies('|')
         .groupby([df['Year'], df['Class']]).sum()
         .mask(lambda x: x == 0)  # zeros -> NaN; stack() then drops them (default dropna=True)
         .rename_axis(['animal'], axis=1).stack().reset_index(name='Count'))
out
Out[666]:
Year Class animal Count
0 2021 A bird 1.0
1 2021 A cat 2.0
2 2021 A dog 2.0
3 2021 B dog 1.0
4 2021 B tiger 1.0
5 2022 A bird 1.0
6 2022 A cat 1.0
7 2022 C cat 1.0
8 2022 C dog 2.0
9 2022 C rabbit 2.0
10 2022 C tiger 1.0
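The Count column above is float because .mask replaces the zero counts with NaN before stacking. As a minimal sketch of a variant (not part of the original answer), the zeros can instead be filtered out after stacking, which keeps the counts as integers. The variable name counts is only illustrative.

```python
import pandas as pd

data = {'Year': [2021, 2021, 2021, 2022, 2022, 2022],
        'Class': ['A', 'A', 'B', 'A', 'C', 'C'],
        'Animal': ['dog|cat|bird', 'cat|dog', 'tiger|dog', 'cat|bird',
                   'dog|cat|rabbit', 'rabbit|dog|tiger']}
df = pd.DataFrame(data)

# One 0/1 indicator column per animal, summed within each (Year, Class) group
counts = (df['Animal'].str.get_dummies('|')
          .groupby([df['Year'], df['Class']]).sum()
          .rename_axis('Animal', axis=1))

# Long format: one row per (Year, Class, Animal); keep only animals that occur
out = (counts.stack()
       .loc[lambda s: s > 0]
       .reset_index(name='Count'))
print(out)
```

The result has the same rows as the output above, with Count stored as an integer column.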
You can do this with a one-liner:
Output:
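The snippet and its output are not reproduced in this answer. One possible one-liner, offered as a sketch rather than the answerer's actual code, splits the Animal strings into lists, explodes them into one row per animal, and counts the rows per (Year, Class, Animal) group. Passing sort=False is an assumption made here so that the groups appear in order of first occurrence, as in the question's desired table.

```python
import pandas as pd

data = {'Year': [2021, 2021, 2021, 2022, 2022, 2022],
        'Class': ['A', 'A', 'B', 'A', 'C', 'C'],
        'Animal': ['dog|cat|bird', 'cat|dog', 'tiger|dog', 'cat|bird',
                   'dog|cat|rabbit', 'rabbit|dog|tiger']}
df = pd.DataFrame(data)

# Split on the literal '|', emit one row per animal, then count rows per group
out = (df.assign(Animal=df['Animal'].str.split('|'))
         .explode('Animal')
         .groupby(['Year', 'Class', 'Animal'], sort=False)
         .size()
         .reset_index(name='Count'))
print(out)
```

Because size() counts rows, Count is an integer column and no zero-count combinations are produced in the first place.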