Python: separate pivot tables for each value of another column, written to separate Excel files

Posted 2025-01-09 01:24:32 · 1101 words · 1 view · 0 comments

I am trying to produce a separate pivot table for each distinct value of a column in my df (as if each pivot table were filtered on that value). The actual file has several hundred distinct R1 values, so I am looking for a way to loop over them and produce the tables separately.

If possible, is there a way to then send each pivot table to a separate Excel file?

import pandas as pd

df = pd.DataFrame({
    'Employee': ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19', '20'],
    'R1': ['mike', 'mike', 'mike', 'mike', 'mike', 'mike', 'mike', 'mike', 'stacey', 'stacey', 'stacey', 'stacey', 'stacey', 'stacey', 'stacey', 'stacey', 'stacey', 'stacey', 'stacey', 'stacey'],
    'R2': ['bill', 'bill', 'bill', 'bill', 'bill', 'chris', 'chris', 'chris', 'jill', 'jill', 'jill', 'tom', 'tom', 'tom', 'tom', 'pete', 'pete', 'pete', 'pete', 'pete'],
})
df

So essentially, one Excel file for mike's world with a count of employees by R2, and one Excel file for stacey's world with a count of employees by R2 (in the real data this would be done for each of the several hundred R1 values).

Thanks!

[Image: Mike Excel file]

[Image: Stacey Excel file]

Comments (1)

末骤雨初歇 2025-01-16 01:24:32

There may be tidier ways to shape the DataFrames before writing them out, but this produces the result you are looking for. It scales to any number of R1 values, because unique() returns the distinct names in R1 (the code below calls that column L2Name, and calls R2 L3Name). For each name, the loop counts employees per L3Name, pivots the counts into a single row, adds a grand total, and writes the result to its own sheet in the workbook at the given file path.

import pandas as pd

# Same sample data as the question, with R1 named L2Name and R2 named L3Name.
data_jobs2 = pd.DataFrame({
    'Employee': ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19', '20'],
    'L2Name': ['mike', 'mike', 'mike', 'mike', 'mike', 'mike', 'mike', 'mike', 'stacey', 'stacey', 'stacey', 'stacey', 'stacey', 'stacey', 'stacey', 'stacey', 'stacey', 'stacey', 'stacey', 'stacey'],
    'L3Name': ['bill', 'bill', 'bill', 'bill', 'bill', 'chris', 'chris', 'chris', 'jill', 'jill', 'jill', 'tom', 'tom', 'tom', 'tom', 'pete', 'pete', 'pete', 'pete', 'pete'],
})

# One sheet is written per distinct L2Name value.
values = data_jobs2['L2Name'].unique()
# Placeholder path; the raw string keeps the backslashes literal.
filepath = r'Your\File\Path\Here\File_name.xlsx'

# The with-block saves and closes the workbook on exit (writer.save() was removed in pandas 2.0).
with pd.ExcelWriter(filepath, engine='openpyxl') as writer:
    for i in values:
        # Count employees per L3Name within this L2Name.
        series = data_jobs2[data_jobs2['L2Name'] == i].groupby(['L2Name', 'L3Name'])['Employee'].count().to_frame().reset_index()
        # Pivot the counts into a single row labelled 'Count of Employee'.
        df_to_write = series.pivot(index='L2Name', columns='L3Name', values='Employee').reset_index().replace({i: 'Count of Employee'}).rename(columns={'L2Name': ''}).set_index('')
        df_to_write['Grand Total'] = df_to_write.sum(axis=1)
        df_to_write.to_excel(writer, sheet_name=i)
        # print() works in a plain script; display() exists only in IPython/Jupyter.
        print(df_to_write)
        print(series)
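
The question asks for one Excel file per R1 value, whereas the code above writes one sheet per value into a single workbook. Below is a minimal sketch of the per-file variant, using the question's column names (R1, R2, Employee). The helper names build_pivots and write_pivots and the output folder name are hypothetical, and the sketch assumes every R1 value is usable as a file name and that openpyxl is installed for writing .xlsx files.

```python
from pathlib import Path

import pandas as pd


def build_pivots(df, group_col="R1", count_col="R2", id_col="Employee"):
    """Return {group value: one-row table of counts per count_col plus a total}."""
    pivots = {}
    # groupby yields one sub-frame per distinct value, so no manual filtering is needed.
    for name, part in df.groupby(group_col):
        counts = part.groupby(count_col)[id_col].count()
        pivot = counts.to_frame().T  # one row, one column per R2 value
        pivot.index = ["Count of Employee"]
        pivot["Grand Total"] = pivot.sum(axis=1)
        pivots[name] = pivot
    return pivots


def write_pivots(pivots, out_dir):
    """Write each table to <out_dir>/<group value>.xlsx and return the paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for name, pivot in pivots.items():
        path = out_dir / f"{name}.xlsx"  # assumes name is a valid file name
        pivot.to_excel(path)  # .xlsx output requires openpyxl
        paths.append(path)
    return paths


df = pd.DataFrame({
    "Employee": [str(n) for n in range(1, 21)],
    "R1": ["mike"] * 8 + ["stacey"] * 12,
    "R2": ["bill"] * 5 + ["chris"] * 3 + ["jill"] * 3 + ["tom"] * 4 + ["pete"] * 5,
})

pivots = build_pivots(df)
print(pivots["mike"])
# write_pivots(pivots, "pivots_by_R1")  # uncomment to write one file per R1 value
```

Building the tables and writing them are kept as separate steps so the tables can be inspected before any file is created. With several hundred R1 values, write_pivots would produce one small workbook per value in the chosen folder.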