Python pandas: passing two or more columns as a parameter to groupby
I'm trying to group by several columns by passing the column names into a function, but I can't get the syntax right and I keep getting errors.
The code works when I pass a single column to group by.
Any advice would be most welcome.
The code is:
groupby = (['HxHorse', 'Jockey', 'Trainer'])
avgof = 'Hrs'
avgcol = 'Hrsmean'
colname = 'Hrs'
df_racecard = getagg(df_racecard, df_hrsall, groupby, avgof, avgcol, colname)
And the function is:
def getagg(df_racecard, df_hrsall, groupby, avgof, avgcol, colname):
    hrs_agg = df_hrsall.groupby([groupby], as_index=False).agg({avgof: ['mean']})
    hrs_agg.columns = ["".join(x) for x in hrs_agg.columns.ravel()]
    hrs_agg.rename(columns = {avgcol:colname}, inplace = True)
    df_racecard = pd.merge(left=df_racecard, right=hrs_agg[[groupby, colname]], left_on='Horse', right_on='HxHorse', how='left')
    df_racecard = df_racecard.drop(['HxHorse'], axis=1)
    return df_racecard
The function works with:
groupby = 'HxHorse'
avgof = 'Hrs'
avgcol = 'Hrsmean'
colname = 'Hrs'
df_racecard = getagg(df_racecard, df_hrsall, groupby, avgof, avgcol, colname)
It seems to fail at:
df_hrsall.groupby([groupby]
I've tried variations of groupby = (['HxHorse', 'Jockey', 'Trainer'])
such as groupby = [['HxHorse'], ['Jockey'], ['Trainer']]
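One possible fix, as a minimal sketch: `[groupby]` wraps a value that is already a list in a second list, so pandas receives `[['HxHorse', 'Jockey', 'Trainer']]` rather than three column labels, and the same nesting happens in `hrs_agg[[groupby, colname]]`. The version below normalizes `groupby` to a flat list first. The name `getagg_multi`, the `left_on` parameter, and the sample data are hypothetical. The sketch assumes the racecard frame has `Horse`, `Jockey` and `Trainer` columns to join on, and pandas 0.25 or later for named aggregation (which also makes `avgcol` unnecessary).

```python
import pandas as pd


def as_list(cols):
    """Return a flat list of column names from a single name or a list/tuple."""
    return [cols] if isinstance(cols, str) else list(cols)


def getagg_multi(df_racecard, df_hrsall, groupby, avgof, colname, left_on=None):
    # Hypothetical rewrite of getagg; left_on is an assumed extra parameter
    # naming the racecard columns that correspond to the group keys.
    keys = as_list(groupby)
    left_keys = keys if left_on is None else as_list(left_on)

    # Named aggregation gives a flat column called `colname`,
    # so no column joining or renaming is needed afterwards.
    hrs_agg = df_hrsall.groupby(keys, as_index=False).agg(**{colname: (avgof, 'mean')})

    merged = df_racecard.merge(hrs_agg, left_on=left_keys, right_on=keys, how='left')

    # Drop right-hand keys whose names differ from the left-hand ones (e.g. HxHorse).
    extra = [k for k in keys if k not in left_keys]
    return merged.drop(columns=extra)


# Made-up sample data, for illustration only.
df_hrsall = pd.DataFrame({
    'HxHorse': ['A', 'A', 'B', 'B'],
    'Jockey': ['J1', 'J1', 'J2', 'J2'],
    'Trainer': ['T1', 'T1', 'T2', 'T2'],
    'Hrs': [1.0, 3.0, 10.0, 20.0],
})
df_racecard = pd.DataFrame({
    'Horse': ['A', 'B', 'C'],
    'Jockey': ['J1', 'J2', 'J3'],
    'Trainer': ['T1', 'T2', 'T3'],
})

result = getagg_multi(
    df_racecard, df_hrsall,
    groupby=['HxHorse', 'Jockey', 'Trainer'],
    avgof='Hrs', colname='Hrs',
    left_on=['Horse', 'Jockey', 'Trainer'],
)

# A single column name still works, because as_list wraps it once.
single = getagg_multi(df_racecard, df_hrsall, 'HxHorse', 'Hrs', 'Hrs', left_on='Horse')
```

With this shape the caller passes either `'HxHorse'` or `['HxHorse', 'Jockey', 'Trainer']`; the nested form `[['HxHorse'], ['Jockey'], ['Trainer']]` is not needed.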