嵌套字典和 Openpyxl
我正在尝试从 xlsx 文件中提取数据,但遇到了一些问题。 我的问题在于最后一个循环。应在字典中重复分配的值。所有类别的每个子类别的值都将被覆盖在其自身和所有其他类别上,而不是为每个类别和子类别提供相应的值。
我的直觉告诉我这与词典交互的方式有关,但我的经验不允许我自己解决这个问题。
提前致谢。
代码如下
#Excelreader.py
import openpyxl, os, pprint
os.chdir('Directory')
print('Opening workbook... ')
wb = openpyxl.load_workbook('Example.xlsx')
sheet = wb['Sheet']
all_dic ={}
part_dic = {}
print('Reading ')
#Assigning the categories to the keys of the main dictionary
for rows in range(2, sheet.max_row + 1):
cat = sheet.cell(row = rows, column = 1).value
all_dic[cat] = {}
#Making a subdictionary for each subcategory
for col in range(2, sheet.max_column +1):
subcat = sheet.cell(row = 1, column = col).value
part_dic[subcat] = float
#Filling out the main dictionary with subdictionaries
for key in all_dic.keys():
all_dic[key] = part_dic
#Reading all the data from the cells and assigning the correspondent values in each subdictionary
for rows in range(2, sheet.max_row + 1):
for col in range(2, sheet.max_column +1):
cat = sheet.cell(row = rows, column = 1).value
subcat = sheet.cell(row = 1, column = col).value
amount = sheet.cell(row = rows, column = col).value
all_dic[cat][subcat] = amount
#Writing the dictionary into a .py that will be imported in the main script
print('Writing results... ')
resultFile = open('Dump.py','w')
resultFile.write('all_data = '+ pprint.pformat(all_dic))
resultFile.close()
print('Done')
dump.py 文件如下所示
all_data = {'Cat A': {'Subcat K': 1,
'Subcat L': 2,
'Subcat X': 3,
'Subcat Y': 0,
'Subcat Z': 0},
'Cat B': {'Subcat K': 1,
'Subcat L': 2,
'Subcat X': 3,
'Subcat Y': 0,
'Subcat Z': 0},
'Cat C': {'Subcat K': 1,
'Subcat L': 2,
'Subcat X': 3,
'Subcat Y': 0,
'Subcat Z': 0},
'Cat D': {'Subcat K': 1,
'Subcat L': 2,
'Subcat X': 3,
'Subcat Y': 0,
'Subcat Z': 0},
'Cat E': {'Subcat K': 1,
'Subcat L': 2,
'Subcat X': 3,
'Subcat Y': 0,
'Subcat Z': 0}}
excel 文件如下所示 以下
I'm trying to extract data from a xlsx file and I've been having some issues.
My issue lies in the last loop. The values that should be assigned in the dictionary repeat. Instead of having the corresponding values for each category and subcategory, the values for each subcategory of all categories is being overwritten on themselves and all other categories.
My hunch tells me it has to do with the way dictionaries interact but my experience doesn't allow me to solve it by myself.
Thanks in advance.
The code is the following
#Excelreader.py
import openpyxl, os, pprint
os.chdir('Directory')
print('Opening workbook... ')
wb = openpyxl.load_workbook('Example.xlsx')
sheet = wb['Sheet']
all_dic ={}
part_dic = {}
print('Reading ')
#Assigning the categories to the keys of the main dictionary
for rows in range(2, sheet.max_row + 1):
cat = sheet.cell(row = rows, column = 1).value
all_dic[cat] = {}
#Making a subdictionary for each subcategory
for col in range(2, sheet.max_column +1):
subcat = sheet.cell(row = 1, column = col).value
part_dic[subcat] = float
#Filling out the main dictionary with subdictionaries
for key in all_dic.keys():
all_dic[key] = part_dic
#Reading all the data from the cells and assigning the correspondent values in each subdictionary
for rows in range(2, sheet.max_row + 1):
for col in range(2, sheet.max_column +1):
cat = sheet.cell(row = rows, column = 1).value
subcat = sheet.cell(row = 1, column = col).value
amount = sheet.cell(row = rows, column = col).value
all_dic[cat][subcat] = amount
#Writing the dictionary into a .py that will be imported in the main script
print('Writing results... ')
resultFile = open('Dump.py','w')
resultFile.write('all_data = '+ pprint.pformat(all_dic))
resultFile.close()
print('Done')
The dump.py file looks like this
all_data = {'Cat A': {'Subcat K': 1,
'Subcat L': 2,
'Subcat X': 3,
'Subcat Y': 0,
'Subcat Z': 0},
'Cat B': {'Subcat K': 1,
'Subcat L': 2,
'Subcat X': 3,
'Subcat Y': 0,
'Subcat Z': 0},
'Cat C': {'Subcat K': 1,
'Subcat L': 2,
'Subcat X': 3,
'Subcat Y': 0,
'Subcat Z': 0},
'Cat D': {'Subcat K': 1,
'Subcat L': 2,
'Subcat X': 3,
'Subcat Y': 0,
'Subcat Z': 0},
'Cat E': {'Subcat K': 1,
'Subcat L': 2,
'Subcat X': 3,
'Subcat Y': 0,
'Subcat Z': 0}}
And the excel file looks like the following
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
当您执行命令 all_dic[key] = part_dic 时,您会将字典链接到同一对象,因此当您更新“Cat A”时,它还会更新 Cats B、C、D 和 E 等每只猫。然后,字典将包含所有猫的最后一行值。
使用复制来制作新对象
只是为了兴趣。您不需要所有循环来创建“all_dic”字典,只需循环一个就可以完成所有操作。如果我正确地阅读了所有内容,那么对下面的代码进行的更改将实现相同的效果;
When you execute the command all_dic[key] = part_dic you link the dictionaries to the same object so when you are updating 'Cat A' it also updates Cats B, C, D and E and so on with each Cat. The dictionary will then contain the last row values for all Cats.
Use copy instead to make a new object
Just for interest. You don't need all the loops to create the 'all_dic' dictionary just loop one should do everything. If I'm reading everything correctly the changes to your code below will achieve the same thing;