Converting a text file to a dictionary with multiple keys
I'm trying to read in a text file formatted like the following:
student
first name: John
last name: Doe
grade: 9
gpa: 4.0
school
name: Richard High School
city: Kansas City
####
student
first name: Jane
last name: Doe
grade: 10
gpa: 3.0
school
name: Richard High School
city: Kansas City
into a Python dictionary. I'd like the end result to look like:
{0: {'student': {'first name': 'John',
                 'last name': 'Doe',
                 'grade': '9',
                 'gpa': '4.0'},
     'school': {'name': 'Richard High School',
                'city': 'Kansas City'}},
 1: {'student': {'first name': 'Jane',
                 'last name': 'Doe',
                 'grade': '10',
                 'gpa': '3.0'},
     'school': {'name': 'Richard High School',
                'city': 'Kansas City'}}
}
So far, I know how to handle the inner keys with:
with open('<filename>') as f:
    dict = {}
    for line in f:
        x, y = line.split(": ")
        dict[x] = y
    print(dict)
But beyond that I'm stuck.
That's a possible solution:
Input file:
Output:
I don't know exactly what your use case is, but you should really consider using dataclasses if your format is this strict.
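To illustrate the dataclass suggestion, here is a minimal sketch. The class names `Student`, `School`, and `Record` and the helper `parse_records` are hypothetical names chosen for this example, not code from the answer. It assumes records are separated by a `####` line, that a line without `": "` starts a new section, and that every record has exactly the fields shown in the question.

```python
from dataclasses import dataclass, asdict


@dataclass
class Student:
    first_name: str
    last_name: str
    grade: str
    gpa: str


@dataclass
class School:
    name: str
    city: str


@dataclass
class Record:
    student: Student
    school: School


def parse_records(text):
    """Parse '####'-separated blocks into a list of Record objects (hypothetical helper)."""
    records = []
    for block in text.split("####"):
        sections = {}
        current = None
        for line in block.splitlines():
            line = line.strip()
            if not line:
                continue  # ignore blank lines
            if ": " in line:
                key, value = line.split(": ", 1)
                # Field names such as 'first name' become 'first_name'.
                sections[current][key.replace(" ", "_")] = value
            else:
                # A line without ': ' is assumed to be a section header.
                current = line
                sections[current] = {}
        if sections:
            records.append(Record(Student(**sections["student"]),
                                  School(**sections["school"])))
    return records
```

If the nested-dictionary shape from the question is still needed, calling `asdict(record)` on each parsed record gives a plain dictionary, with keys using underscores instead of spaces.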
If your data are patterned exactly as you have written, and you don't mind having flat dictionaries, one per student:
Output:
I don't see the benefit of your desired data structure, hence my suggestion for something more conducive to tabular data analysis.
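As a rough sketch of the flat, one-dictionary-per-student idea: the helper name `parse_flat` is hypothetical, and prefixing each key with its section name (so that `name` under `school` becomes `school name`) is an assumption made here to keep keys unambiguous, not something stated in the answer. It also assumes records are separated by a `####` line.

```python
def parse_flat(text):
    """Return one flat dict per '####'-separated record (hypothetical helper)."""
    rows = []
    for block in text.split("####"):
        row = {}
        section = ""
        for line in block.splitlines():
            line = line.strip()
            if not line:
                continue  # ignore blank lines
            if ": " in line:
                key, value = line.split(": ", 1)
                # Prefix the key with its section, e.g. 'school name'.
                row[f"{section} {key}".strip()] = value
            else:
                # A line without ': ' is assumed to be a section header.
                section = line
        if row:
            rows.append(row)
    return rows
```

A list of flat dictionaries like this maps directly onto rows and columns, which is what makes it convenient for tabular analysis.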
EDIT: now that we're talking about Pandas,
Getting the count of students in each grade:
You can also group by any number of columns, for instance by grade and school:
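The grouping idea can be shown without pandas. The sketch below swaps in `collections.Counter` from the standard library as a stand-in for the pandas group-by, so it is not the answer's code. The `rows` list and its column names are assumptions based on the data in the question. With the same rows in a pandas DataFrame `df`, the counts would come from something along the lines of `df.groupby('grade').size()` and `df.groupby(['grade', 'school']).size()`.

```python
from collections import Counter

# Flat rows, one per student, using the data from the question.
# The column names here are illustrative assumptions.
rows = [
    {"first name": "John", "last name": "Doe", "grade": "9", "gpa": "4.0",
     "school": "Richard High School", "city": "Kansas City"},
    {"first name": "Jane", "last name": "Doe", "grade": "10", "gpa": "3.0",
     "school": "Richard High School", "city": "Kansas City"},
]

# Number of students in each grade.
per_grade = Counter(row["grade"] for row in rows)

# Number of students for each (grade, school) combination.
per_grade_and_school = Counter((row["grade"], row["school"]) for row in rows)
```

Grouping by more columns only means adding more fields to the tuple used as the counter key.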
If I've understood your goal correctly, this is the answer. Be aware that it is based on regular expressions; you can learn more about them at regex101.com.
In the first if statement, I skip lines that are blank or contain only whitespace (line breaks).
In the second, I check whether the line has the form "key: value". If it does not, it is a main key and I add it to the main dict; otherwise, I add the pair to the last dict in the main dict.
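The two checks described above might be sketched as follows. The pattern `KEY_VALUE` and the function name `parse_nested` are hypothetical, and starting a new numbered record on each `####` line is an assumption added here to match the output requested in the question; the answer's text does not describe it.

```python
import re

# Matches lines of the form "key: value"; the key may not contain a colon.
KEY_VALUE = re.compile(r"^\s*([^:]+?)\s*:\s*(.*?)\s*$")


def parse_nested(lines):
    """Build {index: {main_key: {key: value}}} from an iterable of lines."""
    result = {0: {}}
    index = 0
    current = None  # the most recently seen main key
    for line in lines:
        # First check: skip blank or whitespace-only lines.
        if not line.strip():
            continue
        # Assumption: a '####' line separates records.
        if line.strip() == "####":
            index += 1
            result[index] = {}
            current = None
            continue
        # Second check: "key: value" lines go into the last main key's dict;
        # any other line is treated as a new main key.
        match = KEY_VALUE.match(line)
        if match:
            result[index][current][match.group(1)] = match.group(2)
        else:
            current = line.strip()
            result[index][current] = {}
    return result
```

Because the function takes any iterable of lines, an open file object can be passed in directly, for example `parse_nested(f)` inside a `with open(...) as f:` block.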
You could do it like this but bear in mind that this method is very specific to the input and output as defined in the original question:
Output: