Python pandas: passing two or more columns as an argument to groupby

Posted on 2025-01-20 16:30:17

I'm trying to write a function that groups by several columns, with the columns passed in as an argument, but I can't get the syntax right and I keep getting errors.
The code works when I pass a single column to group by.
Any advice would be most welcome.

The code is:

groupby = (['HxHorse', 'Jockey', 'Trainer'])
avgof = 'Hrs'
avgcol = 'Hrsmean'
colname = 'Hrs'
df_racecard = getagg(df_racecard, df_hrsall, groupby, avgof, avgcol, colname)

And the function is:

def getagg(df_racecard, df_hrsall, groupby, avgof, avgcol, colname):    
    hrs_agg = df_hrsall.groupby([groupby], as_index=False).agg({avgof: ['mean']})
    hrs_agg.columns = ["".join(x) for x in hrs_agg.columns.ravel()]
    hrs_agg.rename(columns = {avgcol:colname}, inplace = True)        
    df_racecard = pd.merge(left=df_racecard, right=hrs_agg[[groupby, colname]], left_on='Horse', right_on='HxHorse', how='left')
    df_racecard = df_racecard.drop(['HxHorse'], axis=1)
    return df_racecard 

The function works with:

groupby = 'HxHorse'
avgof = 'Hrs'
avgcol = 'Hrsmean'
colname = 'Hrs'
df_racecard = getagg(df_racecard, df_hrsall, groupby, avgof, avgcol, colname)

It seems to fail at:

df_hrsall.groupby([groupby], as_index=False)

I've tried variations of groupby = (['HxHorse', 'Jockey', 'Trainer']), such as groupby = [['HxHorse'], ['Jockey'], ['Trainer']].
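
One likely cause, offered as a sketch rather than a confirmed diagnosis: when groupby is already a list, [groupby] wraps it in a second list, so pandas receives [['HxHorse', 'Jockey', 'Trainer']] instead of a flat list of column names. The same nesting happens in hrs_agg[[groupby, colname]]. The version below normalizes the argument to a flat list first. The card_keys parameter and the sample data are assumptions made for illustration; in particular, it assumes the racecard has Jockey and Trainer columns to merge on, which the question does not state. It also uses named aggregation in place of the join-and-rename step.

```python
import pandas as pd


def getagg(df_racecard, df_hrsall, groupby, avgof, colname, card_keys):
    # A bare string becomes a one-element list; a list is used as-is,
    # so groupby never receives a nested list such as [['A', 'B']].
    keys = [groupby] if isinstance(groupby, str) else list(groupby)
    # Named aggregation produces one flat column called `colname`.
    hrs_agg = df_hrsall.groupby(keys, as_index=False).agg(**{colname: (avgof, "mean")})
    # Rename the history keys to the racecard's names (card_keys is a
    # hypothetical parameter), then merge on all of them at once.
    hrs_agg = hrs_agg.rename(columns=dict(zip(keys, card_keys)))
    return pd.merge(df_racecard, hrs_agg, on=card_keys, how="left")


# Made-up sample data, only to exercise the function.
df_hrsall = pd.DataFrame({
    "HxHorse": ["A", "A", "A", "B"],
    "Jockey": ["J1", "J1", "J2", "J1"],
    "Trainer": ["T1", "T1", "T1", "T2"],
    "Hrs": [1.0, 3.0, 5.0, 4.0],
})
df_racecard = pd.DataFrame({
    "Horse": ["A", "B", "C"],
    "Jockey": ["J1", "J1", "J1"],
    "Trainer": ["T1", "T2", "T1"],
})

# Several grouping columns, passed as a flat list.
result = getagg(df_racecard, df_hrsall,
                ["HxHorse", "Jockey", "Trainer"], "Hrs", "Hrs",
                ["Horse", "Jockey", "Trainer"])

# A single grouping column still works, passed as a plain string.
single = getagg(df_racecard, df_hrsall, "HxHorse", "Hrs", "Hrs", ["Horse"])
```

Merging on every grouping key, rather than on the horse alone, keeps one row per racecard entry; merging only on Horse would repeat a racecard row for each jockey and trainer combination in the history. If df_racecard already has a column named like colname, pandas will add _x and _y suffixes to the two columns, so a different colname may be needed in that case.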
