Python list/dict comprehension: summing one key grouped by another key in the same dict

Posted 2025-02-02 00:05:58

I have been thinking about how to convert this to a one-liner, if possible:

activities = [
    {'type': 'Run', 'distance': 12345, 'other_stuff': other ...},
    {'type': 'Ride', 'distance': 12345, 'other_stuff': other ...},
    {'type': 'Swim', 'distance': 12345, 'other_stuff': other ...},
]

I am currently using:

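from collections import defaultdict

# Accumulate the total distance per activity type in a single pass.
grouped_distance = defaultdict(int)
for activity in activities:
    act_type = activity['type']
    grouped_distance[act_type] += activity['distance']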

# {'Run': 12345, 'Ride': 12345, 'Swim': 12345}

I have tried:

grouped_distance = {activity['type']:[sum(activity['distance']) for activity in activities]}

but this does not work: Python reports that activity['type'] is not defined.
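
For readers wondering why the attempt fails, here is a minimal sketch using made-up sample data. The key expression sits outside the inner list comprehension, so the loop variable is not visible there. Separately, sum() is being called on a single integer, which would also fail. The function name broken_attempt and the variable last_wins are hypothetical names used only for this illustration.

```python
activities = [
    {'type': 'Run', 'distance': 5},
    {'type': 'Run', 'distance': 7},
    {'type': 'Swim', 'distance': 2},
]


def broken_attempt(items):
    # The key expression activity['type'] is evaluated outside the inner
    # list comprehension, where no variable called `activity` exists.
    return {activity['type']: [sum(activity['distance']) for activity in items]}


try:
    broken_attempt(activities)
    raised_name_error = False
except NameError:
    raised_name_error = True

# Putting the key and the loop in the same comprehension runs, but a plain
# dict comprehension cannot accumulate: a later item overwrites an earlier one.
last_wins = {a['type']: a['distance'] for a in activities}

print(raised_name_error)  # True
print(last_wins)          # {'Run': 7, 'Swim': 2}
```

This is why the answers below either group the items first or make one pass per type: a single dict comprehension over the raw list has nowhere to keep a running total.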

Edit:
Fixed some variable-name typos, as noticed by @Samwise.

Update:
I ran a benchmark on all the solutions that were posted.
10 million items, with 10 different types:

Method 1 (Counter): 7.43s
Method 2 (itertools @chepner): 8.64s
Method 3 (groups @Dmig): 19.34s
Method 4 (pandas @d.b): 32.73s
Method 5 (Dict @d.b): 10.95s

Tested on a Raspberry Pi 4 to make the differences easier to see.
Please correct me if I have named any of the methods wrongly.
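
The Counter-based answer behind "Method 1" is not reproduced on this page, so the following is only a guess at its general shape, with made-up sample data. It assumes the answer accumulated distances into a collections.Counter in a single loop.

```python
from collections import Counter

# Sample data invented for this sketch.
activities = [
    {'type': 'Run', 'distance': 5},
    {'type': 'Run', 'distance': 7},
    {'type': 'Swim', 'distance': 2},
]

# A Counter returns 0 for missing keys, so += works without any setup.
grouped_distance = Counter()
for activity in activities:
    grouped_distance[activity['type']] += activity['distance']

print(dict(grouped_distance))  # {'Run': 12, 'Swim': 2}
```

Written this way it does the same work as the defaultdict(int) loop in the question: one pass over the list, one dictionary update per item.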

Thank you, everyone. @Dmig, @Mark and @juanpa.arrivillaga have piqued my interest in performance: Shorter/Neater ≠ Higher Performance. I only wanted to ask whether I could write this as a one-liner so that it would look neater, but I have learnt a lot more than that.

冬天旳寂寞 2025-02-09 00:05:58

Your solution is good as it is, but if you really want a one-liner:

act = [{'type': 'run', 'distance': 4}, {'type': 'run', 'distance': 3}, {'type': 'swim', 'distance': 5}]

groups = {
  t: sum(i['distance'] for i in act if i['type'] == t)
  for t in {i['type'] for i in act}  # set with all possible activities
}

print(groups)  # {'run': 7, 'swim': 5}

UPD: I did some performance testing, comparing this answer to the answer that uses groupby(sorted(...)). It turns out that, on ten million entries with 10 different types, this approach loses to groupby(sorted(...)): 18.14 seconds against 10.12. So, while it is more readable, it is less efficient on bigger lists, especially ones with more distinct types (because it iterates over the initial list once for each distinct type).

But note that the straightforward loop from the question takes only 5 seconds!

This answer only shows the one-liner for educational purposes; the solution from the question performs much better. You should not use this instead of the one in the question unless, as I said, you really want or need a one-liner.
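
To make the "one pass per distinct type" cost concrete, here is a rough sketch that counts how many items each approach visits. The counter dictionary and the counted helper are hypothetical, added only for this illustration, and it measures visits rather than wall-clock time.

```python
act = [
    {'type': 'run', 'distance': 4},
    {'type': 'run', 'distance': 3},
    {'type': 'swim', 'distance': 5},
]

counter = {'visits': 0}  # hypothetical tally of how many items get visited


def counted(items):
    """Yield items unchanged while counting how many were visited."""
    for item in items:
        counter['visits'] += 1
        yield item


# One-liner: one pass to collect the types, then one pass per distinct type.
groups = {
    t: sum(i['distance'] for i in counted(act) if i['type'] == t)
    for t in {i['type'] for i in counted(act)}
}
one_liner_visits = counter['visits']

# Loop in the style of the question: a single pass, however many types exist.
counter['visits'] = 0
totals = {}
for i in counted(act):
    totals[i['type']] = totals.get(i['type'], 0) + i['distance']
loop_visits = counter['visits']

print(one_liner_visits, loop_visits)  # 9 3
```

With 3 items and 2 types the one-liner visits 3 + 2 × 3 = 9 items, while the loop visits 3. With 10 types the one-liner would make 11 passes over the list.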

浮萍、无处依 2025-02-09 00:05:58

Use itertools.groupby.

from itertools import groupby
from operator import itemgetter


by_type = itemgetter('type')
distance = itemgetter('distance')
# groupby only merges consecutive items, so sort by type first.
result = {
    k: sum(map(distance, v))
    for k, v in groupby(sorted(activities, key=by_type), by_type)
}

When iterating over the groupby instance, k will be one of the activity types, and v will be an iterable of activities having type k.
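
The sorted() call matters because groupby only groups consecutive items with equal keys. A small sketch with made-up sample data, where the two 'Run' entries are deliberately not adjacent:

```python
from itertools import groupby
from operator import itemgetter

# Sample data invented for this sketch; note the two 'Run' items are apart.
activities = [
    {'type': 'Run', 'distance': 5},
    {'type': 'Swim', 'distance': 2},
    {'type': 'Run', 'distance': 7},
]

by_type = itemgetter('type')
distance = itemgetter('distance')

# Without sorting, 'Run' shows up as two separate groups.
unsorted_keys = [k for k, _ in groupby(activities, by_type)]

# With sorting, each type forms exactly one group and the sums are correct.
result = {
    k: sum(map(distance, v))
    for k, v in groupby(sorted(activities, key=by_type), by_type)
}

print(unsorted_keys)  # ['Run', 'Swim', 'Run']
print(result)         # {'Run': 12, 'Swim': 2}
```

If the unsorted groups were fed into a dict comprehension, the second 'Run' group would silently overwrite the first. The sort is also an extra O(n log n) step that the plain loop from the question does not need.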
