Python: removing duplicates from a list of tuples
I have the following list:
[('mail', 167, datetime.datetime(2010, 9, 29)) ,
('name', 1317, datetime.datetime(2011, 12, 12)),
('mail', 1045, datetime.datetime(2010, 8, 13)),
('name', 3, datetime.datetime(2011, 11, 3))]
I want to remove every tuple whose first item matches that of another tuple but whose date is not the latest. In other words, I need to get this:
[('mail', 167, datetime.datetime(2010, 9, 29)) ,
('name', 1317, datetime.datetime(2011, 12, 12))]
You can use a dictionary to store the highest value found for a given key so far:
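The code for this answer did not survive on this page, so what follows is only a minimal sketch of the idea, not the original answer's code. The function name `latest_per_key` and the variable `L` are assumptions made for illustration.

```python
import datetime

def latest_per_key(items):
    """For each first element, keep the tuple with the latest date (third element)."""
    latest = {}
    for item in items:
        key, date = item[0], item[2]
        # Store the item if the key is new or this date is later than the stored one.
        if key not in latest or date > latest[key][2]:
            latest[key] = item
    return list(latest.values())

L = [('mail', 167, datetime.datetime(2010, 9, 29)),
     ('name', 1317, datetime.datetime(2011, 12, 12)),
     ('mail', 1045, datetime.datetime(2010, 8, 13)),
     ('name', 3, datetime.datetime(2011, 11, 3))]

result = latest_per_key(L)
```

This makes a single pass over the list and does not require it to be sorted.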
The following approach uses a dictionary to overwrite entries with the same key. Once the list is sorted by date, older entries get overwritten by newer ones.
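The original code is missing here. A rough sketch of the overwrite idea, assuming the list is sorted by date first, might look like this (the name `dedupe_by_overwrite` is hypothetical):

```python
import datetime
from operator import itemgetter

def dedupe_by_overwrite(items):
    """Sort by date, then let later (newer) entries overwrite earlier ones per key."""
    newest = {}
    for item in sorted(items, key=itemgetter(2)):  # oldest first
        newest[item[0]] = item  # a newer entry replaces the older one
    return list(newest.values())

L = [('mail', 167, datetime.datetime(2010, 9, 29)),
     ('name', 1317, datetime.datetime(2011, 12, 12)),
     ('mail', 1045, datetime.datetime(2010, 8, 13)),
     ('name', 3, datetime.datetime(2011, 11, 3))]

deduped = dedupe_by_overwrite(L)
```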
Or, for something a lot more compact (but much less readable):
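The compact version is also missing from this page. One possible one-liner in the same spirit, offered as an assumption rather than the answer's actual code, is a dict comprehension over the date-sorted list:

```python
import datetime

L = [('mail', 167, datetime.datetime(2010, 9, 29)),
     ('name', 1317, datetime.datetime(2011, 12, 12)),
     ('mail', 1045, datetime.datetime(2010, 8, 13)),
     ('name', 3, datetime.datetime(2011, 11, 3))]

# Later (newer) tuples overwrite earlier ones that share the same first element.
compact = list({t[0]: t for t in sorted(L, key=lambda t: t[2])}.values())
```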
Update
This method would be reasonably quick if the list is already (or mostly) sorted by date. If it isn't, and especially if it is a large list, then this may not be the best approach.
For unsorted lists, you will likely get some performance improvement by sorting by the key first, then by the date, i.e.
sorted(L, key=lambda t: (t[0], t[2]))
Or, better yet, go for Space_C0wb0y's answer.
You can do it by sorting the list and taking, for each key, the entry with the highest d[2]:
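The answer's code is not preserved here. A sketch of one way to do this, assuming `itertools.groupby` is used to group the sorted entries (the original may have done it differently), is:

```python
import datetime
from itertools import groupby
from operator import itemgetter

def latest_by_group(data):
    """Group tuples by their first element and keep the one with the highest d[2]."""
    ordered = sorted(data, key=itemgetter(0))  # groupby needs equal keys to be adjacent
    return [max(group, key=itemgetter(2))
            for _, group in groupby(ordered, key=itemgetter(0))]

L = [('mail', 167, datetime.datetime(2010, 9, 29)),
     ('name', 1317, datetime.datetime(2011, 12, 12)),
     ('mail', 1045, datetime.datetime(2010, 8, 13)),
     ('name', 3, datetime.datetime(2011, 11, 3))]

grouped = latest_by_group(L)
```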
Here you go.
That sorts the list using the first and third tuple elements as keys, then removes duplicates by using a hash table.
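The code this description refers to is missing from the page. The description could be sketched roughly as follows, assuming a descending sort and a set as the hash table; the name `dedupe_sorted` is hypothetical:

```python
import datetime

def dedupe_sorted(items):
    """Sort by (first, third) element, newest first, and keep the first tuple seen per key."""
    seen = set()  # hash table of first elements already kept
    kept = []
    for item in sorted(items, key=lambda t: (t[0], t[2]), reverse=True):
        if item[0] not in seen:
            seen.add(item[0])
            kept.append(item)
    return kept

L = [('mail', 167, datetime.datetime(2010, 9, 29)),
     ('name', 1317, datetime.datetime(2011, 12, 12)),
     ('mail', 1045, datetime.datetime(2010, 8, 13)),
     ('name', 3, datetime.datetime(2011, 11, 3))]

unique = dedupe_sorted(L)
```

Because the sort is descending, the first tuple seen for each key is the one with the latest date.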