Python: multiple separate pivot tables based on another column, written to separate Excel files

Posted 2025-01-09 01:24:32 · 1101 characters · 4 views · 0 comments

I am trying to produce a separate pivot table for each distinct value in another column of my df (that is, one pivot table filtered to each value). The actual file has several hundred R1 values, so I was trying to find a way to loop over them and produce each pivot separately.

If possible, is there a way to then send each pivot to a separate Excel file?

import pandas as pd
df=pd.DataFrame({'Employee':['1','2','3','4','5','6','7','8','9','10','11','12', '13', '14', '15', '16', '17', '18', '19', '20'],
'R1': ['mike', 'mike', 'mike', 'mike', 'mike', 'mike', 'mike', 'mike', 'stacey' , 'stacey', 'stacey', 'stacey', 'stacey', 'stacey', 'stacey', 'stacey', 'stacey', 'stacey', 'stacey', 'stacey'],
'R2':['bill', 'bill', 'bill', 'bill', 'bill', 'chris', 'chris', 'chris', 'jill',  'jill', 'jill', 'tom', 'tom', 'tom', 'tom', 'pete', 'pete', 'pete', 'pete', 'pete']})
df

So essentially one Excel file for mike's world that has a count of employees by R2, and one Excel file for stacey's world that has a count of employees by R2 (but in the real data this would be done for the several hundred R1 values).

thanks!

Mike excel

Stacey excel

末骤雨初歇 2025-01-16 01:24:32

While there may be prettier ways to handle the dataframes before writing to the sheets, this gave me the results you were looking for. It should scale to any number of R1 values, since "unique()" returns the unique names in R1 (called L2Name in the code here, with R2 as L3Name). The loop then builds the pivot for each name and writes it to its own sheet in the workbook at the given filepath.

import pandas as pd

data_jobs2=pd.DataFrame({'Employee':['1','2','3','4','5','6','7','8','9','10','11','12', '13', '14', '15', '16', '17', '18', '19', '20'],
'L2Name': ['mike', 'mike', 'mike', 'mike', 'mike', 'mike', 'mike', 'mike', 'stacey' , 'stacey', 'stacey', 'stacey', 'stacey', 'stacey', 'stacey', 'stacey', 'stacey', 'stacey', 'stacey', 'stacey'],
'L3Name':['bill', 'bill', 'bill', 'bill', 'bill', 'chris', 'chris', 'chris', 'jill',  'jill', 'jill', 'tom', 'tom', 'tom', 'tom', 'pete', 'pete', 'pete', 'pete', 'pete']})
values = data_jobs2['L2Name'].unique()
# Raw string so the backslashes in the Windows path are not read as escape sequences
filepath = r'Your\File\Path\Here\File_name.xlsx'
# The with-block saves and closes the workbook on exit (writer.save() was removed in pandas 2.0)
with pd.ExcelWriter(filepath, engine='openpyxl') as writer:
    for i in values:
        # Count employees per L3Name for the current L2Name
        series = data_jobs2[data_jobs2['L2Name'] == i].groupby(['L2Name','L3Name'])['Employee'].count().to_frame().reset_index()
        # Pivot to a single row labelled 'Count of Employee' with one column per L3Name
        df_to_write = series.pivot(index='L2Name', columns='L3Name', values='Employee').reset_index().replace({i: 'Count of Employee'}).rename(columns={'L2Name': ''}).set_index('')
        df_to_write['Grand Total'] = df_to_write.sum(axis=1)
        # One sheet per L2Name inside the same workbook
        df_to_write.to_excel(writer, sheet_name=i)
        print(df_to_write)  # use display() instead when running in a Jupyter notebook
        print(series)
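
The code above writes one sheet per name into a single workbook, while the question asks for one Excel file per R1 value. A minimal sketch of that variant follows, using the question's column names (R1, R2, Employee). The function names pivots_by_r1 and write_pivots are made up for illustration, and the sketch assumes that every R1 value is usable as a file name and that openpyxl is installed for writing .xlsx files.

```python
from pathlib import Path

import pandas as pd


def pivots_by_r1(df: pd.DataFrame) -> dict:
    """Return {R1 value: one-row pivot of employee counts per R2}."""
    pivots = {}
    # groupby yields one sub-frame per distinct R1, so no manual filtering is needed
    for name, group in df.groupby('R1'):
        counts = group.groupby('R2')['Employee'].count()
        # One row labelled 'Count of Employee', one column per R2 value
        pivot = counts.to_frame('Count of Employee').T
        pivot['Grand Total'] = pivot.sum(axis=1)
        pivots[name] = pivot
    return pivots


def write_pivots(pivots: dict, out_dir: str) -> list:
    """Write each pivot to its own workbook, <out_dir>/<R1 value>.xlsx."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for name, pivot in pivots.items():
        # Assumption: the R1 value contains no characters that are invalid in file names
        path = out / f'{name}.xlsx'
        # Excel limits sheet names to 31 characters
        pivot.to_excel(path, sheet_name=str(name)[:31])
        paths.append(path)
    return paths
```

With the df from the question, calling write_pivots(pivots_by_r1(df), 'pivots') should produce pivots/mike.xlsx and pivots/stacey.xlsx, where 'pivots' is just a placeholder directory name. Keeping the pivot building separate from the file writing makes it easy to inspect the tables before anything is written to disk.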
