Python dictionary, finding similarities
I have a Python dictionary with a thousand items. Each item is itself a dictionary. I'm looking for a clean and elegant way to go through each item and to find and create templates.
Here's a simplified example of the individual dictionaries' structure:
{'id': 1,
'template': None,
'height': 80,
'width': 120,
'length': 75,
'weight': 100}
From this, I want to make a single pass and, if 500 of the 1000 share the same height and width, detect that so I can build a template from that data and assign the template ID to 'template'. I could build a gigantic reference hash, but I'm hoping there's a cleaner, more elegant way to accomplish this.
The actual data includes closer to 30 keys, a small subset of which needs to be excluded from the template checking.
Comments (2)
Given a dict of dicts, items:
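The snippet that followed this line is not shown here. As a rough stand-in (not the original code), grouping on height and width with itertools.groupby() could look something like the sketch below. The sample items and the size_key helper are made up for illustration. Since groupby() only merges adjacent items, the sketch sorts by the same key first.

```python
from itertools import groupby

# Hypothetical sample data in the shape shown in the question.
items = {
    1: {'id': 1, 'template': None, 'height': 80, 'width': 120, 'length': 75, 'weight': 100},
    2: {'id': 2, 'template': None, 'height': 80, 'width': 120, 'length': 60, 'weight': 90},
    3: {'id': 3, 'template': None, 'height': 50, 'width': 40, 'length': 75, 'weight': 100},
}

def size_key(item):
    """Hypothetical key used for both sorting and grouping: (height, width)."""
    return item['height'], item['width']

# groupby() only merges adjacent items, so sort with the same key first.
ordered = sorted(items.values(), key=size_key)

# Count how many items share each (height, width) pair.
group_sizes = {key: len(list(group)) for key, group in groupby(ordered, key=size_key)}
print(group_sizes)
```

This only reports how many items share each pair of values; it does not create templates or handle excluded keys.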
@eumiro had an excellent core idea, namely using itertools.groupby() to arrange the items with common values together in batches. However, besides neglecting to sort things first using the same key function, as @Jochen Ritzel pointed out (and as the documentation also mentions), he didn't address several other things you mentioned wanting to do.
Below is a more complete and somewhat longer answer. It determines the templates and assigns them in one pass through the dict-of-dicts. To do this, after first creating a sorted list of items, it uses groupby() to batch them and, if there are enough in a group, creates a template and assigns its ID to each member.
When I run it, the following is the output:
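The steps described above can be sketched roughly as follows. This is not the answer's original code. The names EXCLUDED_KEYS, MIN_GROUP_SIZE, template_key and assign_templates, the sample data, and the numbering of template IDs are all assumptions made for illustration. It also assumes that every compared value is hashable and can be ordered against the others, since the items are sorted on those values.

```python
from itertools import groupby

# Assumed for illustration: keys left out of the comparison, and the
# minimum number of matching items needed before a template is created.
EXCLUDED_KEYS = {'id', 'template', 'length', 'weight'}
MIN_GROUP_SIZE = 2

def template_key(item):
    """Sorted (key, value) pairs of every field that takes part in the comparison."""
    return tuple(sorted((k, v) for k, v in item.items() if k not in EXCLUDED_KEYS))

def assign_templates(items, min_group_size=MIN_GROUP_SIZE):
    """Tag each large-enough group of matching items with a template ID.

    Returns a dict mapping template ID -> the field values shared by the group.
    """
    templates = {}
    # groupby() needs its input sorted with the same key function.
    ordered = sorted(items.values(), key=template_key)
    for key, group in groupby(ordered, key=template_key):
        members = list(group)
        if len(members) >= min_group_size:
            template_id = len(templates) + 1  # assumed ID scheme: 1, 2, 3, ...
            templates[template_id] = dict(key)
            for member in members:
                member['template'] = template_id
    return templates

# Hypothetical sample data in the shape shown in the question.
items = {
    1: {'id': 1, 'template': None, 'height': 80, 'width': 120, 'length': 75, 'weight': 100},
    2: {'id': 2, 'template': None, 'height': 80, 'width': 120, 'length': 60, 'weight': 90},
    3: {'id': 3, 'template': None, 'height': 50, 'width': 40, 'length': 75, 'weight': 100},
}

templates = assign_templates(items)
print(templates)
for item in items.values():
    print(item['id'], item['template'])
```

With the real data, EXCLUDED_KEYS would hold the small subset of the roughly 30 keys that should not take part in the template check, and min_group_size could be raised to whatever count makes a template worthwhile.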