Pandas Groupby Python
I have a dataset containing country names and some other information, such as salary. I need to find the mean salary of the employees in the file, grouped by country_id and city, in the ranges (0, 5000], (5000, 10000] and (10000, 15000].
I was using the method below, but the resulting table is not what I want. Can you help me with that?
df = file.groupby(['country_id',"city"])['salary'].mean().reset_index(name="mean")
bins = [0, 5000]
df['binned'] = pd.cut(df['mean'], bins)
print(df)
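One likely reason the table looks wrong: bins = [0, 5000] defines only the single interval (0, 5000], so pd.cut marks every mean above 5000 as NaN. Below is a minimal sketch with all four edges. The small DataFrame standing in for file is hypothetical sample data; only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample data standing in for `file` (not from the question)
file = pd.DataFrame({
    "country_id": ["US", "US", "US", "DE", "DE"],
    "city": ["Boston", "Boston", "Austin", "Berlin", "Berlin"],
    "salary": [4000, 5000, 12000, 7000, 9000],
})

# Mean salary per (country_id, city), as in the question
df = file.groupby(["country_id", "city"])["salary"].mean().reset_index(name="mean")

# Four edges produce three right-closed intervals:
# (0, 5000], (5000, 10000], (10000, 15000]
bins = [0, 5000, 10000, 15000]
df["binned"] = pd.cut(df["mean"], bins)
print(df)
```

Note that this bins the group means, not the individual salaries. If each employee's salary should be binned before averaging, the binning has to happen before the groupby.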
Comments (1)
I think if the width of your salary bins is always 5000, you can compute the bin number of each row by using the / operator and math.ceil.
With salary_bin_number, you can then build a label for each bin and store it in a column, salary_range_str.
Then group by salary_range_str and country to calculate the average salary for each combination of country and salary_range_str.
Finally, pivot the salary_range_str values into columns.
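The answer's own code and output did not survive on this page, so here is a rough sketch of the steps it describes, not the original author's code. The sample data, the BIN_WIDTH constant, the bin_label helper and the mean_salary column name are all assumptions made for illustration; salary_bin_number, salary_range_str and country are the names used in the answer.

```python
import math
import pandas as pd

# Hypothetical sample data (not from the original answer)
df = pd.DataFrame({
    "country": ["US", "US", "US", "DE", "DE"],
    "salary": [4000, 5000, 12000, 7000, 9000],
})

BIN_WIDTH = 5000  # assumed fixed bin width

# Bin number per row: 1 for (0, 5000], 2 for (5000, 10000], and so on
df["salary_bin_number"] = df["salary"].apply(lambda s: math.ceil(s / BIN_WIDTH))

def bin_label(n):
    """Turn a bin number into a label such as '(5000, 10000]'."""
    return f"({(n - 1) * BIN_WIDTH}, {n * BIN_WIDTH}]"

df["salary_range_str"] = df["salary_bin_number"].apply(bin_label)

# Average salary for each (country, salary range) pair
mean_df = (
    df.groupby(["country", "salary_range_str"])["salary"]
    .mean()
    .reset_index(name="mean_salary")
)

# One row per country, one column per salary range
result = mean_df.pivot(index="country", columns="salary_range_str", values="mean_salary")
print(result)
```

Combinations with no employees show up as NaN after the pivot. Because the labels are plain strings, the columns sort alphabetically, so "(10000, 15000]" lands before "(5000, 10000]"; reorder the columns afterwards if numeric order matters.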