Elegant way to remove items from a sequence in Python?
When I am writing code in Python, I often need to remove items from a list or other sequence type based on some criteria. I haven't found a solution that is elegant and efficient, as removing items from a list you are currently iterating through is bad. For example, you can't do this:
for name in names:
    if name[-5:] == 'Smith':
        names.remove(name)
I usually end up doing something like this:
toremove = []
for name in names:
    if name[-5:] == 'Smith':
        toremove.append(name)
for name in toremove:
    names.remove(name)
del toremove
This is inefficient, fairly ugly and possibly buggy (how does it handle multiple 'John Smith' entries?). Does anyone have a more elegant solution, or at least a more efficient one?
How about one that works with dictionaries?
Comments (14)
There are times when filtering (either using filter or a list comprehension) doesn't work. This happens when some other object is holding a reference to the list you're modifying and you need to modify the list in place.
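A minimal sketch of this in-place variant, assuming `names` is a plain list of strings; `other_ref` is a hypothetical second reference added only to show that the same list object is modified:

```python
names = ["John Smith", "Jane Doe", "John Smith", "Ann Lee"]
other_ref = names  # hypothetical second reference to the same list object

# Iterate over a shallow copy so that removals do not disturb the iteration.
for name in names[:]:
    if name[-5:] == "Smith":
        names.remove(name)

print(names)               # the Smith entries are gone
print(other_ref is names)  # still the same list object
```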
The only difference from the original code is the use of names[:] instead of names in the for loop. That way the code iterates over a (shallow) copy of the list and the removals work as expected. Since the list copying is shallow, it's fairly quick.
filter would be awesome for this. Simple example:
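A minimal sketch of the filter approach, assuming `names` is a list of strings and Python 3, where `filter` returns a lazy iterator and therefore needs `list()` around it:

```python
names = ["John Smith", "Jane Doe", "Ann Lee"]

# filter() keeps the items for which the function returns True,
# so the condition is "does NOT end with Smith".
names = list(filter(lambda name: not name.endswith("Smith"), names))

print(names)
```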
Edit: Corey's list comprehension is awesome too.
Both solutions, filter and comprehension, require building a new list. I don't know enough of the Python internals to be sure, but I think that a more traditional (but less elegant) approach could be more efficient:
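One way such a traditional loop might look is an index-based `while` loop that deletes in place. This is a sketch under the assumption that `names` is a list of strings, not necessarily the exact code that was posted:

```python
names = ["John Smith", "Jane Doe", "Bob Smith", "Ann Lee"]

i = 0
while i < len(names):
    if names[i][-5:] == "Smith":
        # After the deletion the next item slides into position i,
        # so the index is not advanced here.
        del names[i]
    else:
        i += 1

print(names)
```

Note that each `del names[i]` shifts the tail of the list, so this avoids building a new list but is not guaranteed to be faster on long lists with many matches.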
Anyway, for short lists, I stick with either of the two solutions proposed earlier.
To answer your question about working with dictionaries, you should note that Python 3.0 will include dict comprehensions:
In the meantime, you can do a quasi-dict comprehension this way:
Or as a more direct answer:
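The three dictionary variants mentioned above could be sketched roughly as follows. The `people` dictionary and its keys are hypothetical sample data, and the syntax assumes a Python version that already has dict comprehensions:

```python
people = {1: "John Smith", 2: "Jane Doe", 3: "Ann Lee"}  # hypothetical data

# Dict comprehension: build a new dict without the Smith entries.
kept = {key: name for key, name in people.items() if name[-5:] != "Smith"}

# Quasi-dict comprehension: dict() over a sequence of (key, value) pairs.
kept_pairs = dict((key, name) for key, name in people.items()
                  if name[-5:] != "Smith")

# In-place removal: collect the keys first, because a dict must not
# change size while it is being iterated.
for key in [k for k, name in people.items() if name[-5:] == "Smith"]:
    del people[key]

print(kept, kept_pairs, people)
```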
If the list should be filtered in place and the list is quite big, then the algorithms mentioned in the previous answers, which are based on list.remove(), may be unsuitable, because their computational complexity is O(n^2). In this case you can use the following not-so-pythonic function:
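A minimal sketch of an O(n) in-place filter, assuming a read/write index compaction; the name `filter_inplace` comes from the answer, while the parameter names and body are illustrative assumptions:

```python
def filter_inplace(func, items):
    """Keep only the items for which func(item) is true, in O(n) time."""
    write = 0
    for item in items:
        if func(item):
            # Move each kept item forward to the next free slot.
            items[write] = item
            write += 1
    # Drop the leftover tail in a single operation.
    del items[write:]


names = ["John Smith", "Jane Doe", "Bob Smith", "Ann Lee"]
filter_inplace(lambda name: name[-5:] != "Smith", names)
print(names)
```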
Edit:
Actually, the solution at https://stackoverflow.com/a/4639748/274937 is superior to my solution. It is more pythonic and works faster. So, here is a new filter_inplace() implementation:
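One plausible shape for such an implementation is slice assignment, which replaces the contents of the list while keeping the same list object. This is a sketch under that assumption, not necessarily the code from the linked answer:

```python
def filter_inplace(func, items):
    """Keep only the items for which func(item) is true.

    Slice assignment swaps the contents of the existing list object,
    so other references to it see the filtered result.
    """
    items[:] = [item for item in items if func(item)]


names = ["John Smith", "Jane Doe", "Ann Lee"]
same_list = names  # hypothetical second reference
filter_inplace(lambda name: name[-5:] != "Smith", names)
print(names, same_list is names)
```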
The filter and list comprehensions are ok for your example, but they have a couple of problems:
Your original solution is actually more efficient for very big lists, even if we can agree it's uglier. But if you worry that you can have multiple 'John Smith', it can be fixed by deleting based on position and not on value:
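A sketch of the two-pass idea with deletion by position, assuming `names` is a list of strings; the helper name `to_remove` is illustrative:

```python
names = ["John Smith", "Jane Doe", "John Smith", "Ann Lee"]

# Pass 1: record the positions (not the values) of the items to delete.
to_remove = [i for i, name in enumerate(names) if name[-5:] == "Smith"]

# Pass 2: delete from the end so that earlier positions stay valid.
for i in reversed(to_remove):
    del names[i]

print(names)
```

Because the deletions are by index, duplicate 'John Smith' entries are each removed exactly once.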
We can't pick a solution without considering the size of the list, but for big lists I would prefer your 2-pass solution instead of the filter or list comprehensions.
The obvious answer is the one that John and a couple other people gave, namely:
But that has the disadvantage that it creates a new list object, rather than reusing the original object. I did some profiling and experimentation, and the most efficient method I came up with is:
Assigning to "names[:]" basically means "replace the contents of the names list with the following value". It's different from just assigning to names, in that it doesn't create a new list object. The right hand side of the assignment is a generator expression (note the use of parentheses rather than square brackets). This will cause Python to iterate across the list.
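The slice-assignment idiom described above can be sketched like this, assuming `names` is a list of strings; `alias` is a hypothetical second reference used only to show that no new list object is created:

```python
names = ["John Smith", "Jane Doe", "Ann Lee"]
alias = names  # hypothetical second reference to the same list

# The generator expression (parentheses, not square brackets) is consumed
# and its values replace the contents of the existing list object.
names[:] = (name for name in names if name[-5:] != "Smith")

print(names, alias is names)
```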
Some quick profiling suggests that this is about 30% faster than the list comprehension approach, and about 40% faster than the filter approach.
Caveat: while this solution is faster than the obvious solution, it is more obscure, and relies on more advanced Python techniques. If you do use it, I recommend accompanying it with a comment. It's probably only worth using in cases where you really care about the performance of this particular operation (which is pretty fast no matter what). (In the case where I used this, I was doing A* beam search, and used this to remove search points from the search beam.)
Two easy ways to accomplish just the filtering are:
Using filter:
names = filter(lambda name: name[-5:] != "Smith", names)
Using list comprehensions:
names = [name for name in names if name[-5:] != "Smith"]
Note that both cases keep the values for which the predicate function evaluates to True, so you have to reverse the logic (i.e. you say "keep the people who do not have the last name Smith" instead of "remove the people who have the last name Smith").
Edit: Funny... two people individually posted both of the answers I suggested as I was posting mine.
You can also iterate backwards over the list:
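A minimal sketch of backward iteration, assuming deletion by index over `reversed(range(len(names)))`; this is an illustrative variant rather than necessarily the code that was posted:

```python
names = ["John Smith", "Jane Doe", "Bob Smith", "Ann Lee"]

# Walk the indices from the end; deleting position i only shifts items
# that have already been visited, so no element is skipped.
for i in reversed(range(len(names))):
    if names[i][-5:] == "Smith":
        del names[i]

print(names)
```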
This has the advantage that it does not create a new list (like filter or a list comprehension) and uses an iterator instead of a list copy (like [:]).
Note that although removing elements while iterating backwards is safe, inserting them is somewhat trickier.
Using a list comprehension
Here is my filter_inplace implementation that can be used to filter items from a list in place. I came up with this on my own before finding this page. It is the same algorithm as what PabloG posted, just made more generic so you can use it to filter lists in place. It can also remove items from the list based on the comparisonFunc if reversed is set to True; a sort of reversed filter, if you will.
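A rough sketch of what such a generic function might look like. The names `filter_inplace`, `comparisonFunc` and `reversed` come from the answer; the signature and body are assumptions for illustration:

```python
def filter_inplace(comparisonFunc, items, reversed=False):
    """Filter items in place with an index-based while loop.

    By default, items for which comparisonFunc(item) is true are kept.
    With reversed=True they are removed instead. Note that the parameter
    name shadows the built-in reversed() inside this function.
    """
    i = 0
    while i < len(items):
        keep = bool(comparisonFunc(items[i])) != reversed
        if keep:
            i += 1
        else:
            del items[i]  # the next item slides into position i


smiths = ["John Smith", "Jane Doe", "Bob Smith"]
others = list(smiths)
filter_inplace(lambda n: n.endswith("Smith"), smiths)
filter_inplace(lambda n: n.endswith("Smith"), others, reversed=True)
print(smiths, others)
```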
In the case of a set.
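For a set, one possible sketch is to collect the unwanted members first and then subtract them in place. This assumes `names` is a set of strings; `alias` is a hypothetical second reference:

```python
names = {"John Smith", "Jane Doe", "Ann Lee"}
alias = names  # hypothetical second reference to the same set

# Build the set of unwanted members first, then remove them in place;
# a set must not change size while it is being iterated.
names -= {name for name in names if name[-5:] == "Smith"}

print(names, alias is names)
```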
Well, this is clearly an issue with the data structure you are using. Use a hashtable for example. Some implementations support multiple entries per key, so one can either pop the newest element off, or remove all of them.
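As an illustration of the data-structure idea, here is a sketch that groups names by surname in a dictionary of lists. Keying by the last word of each name is an assumption made for this example:

```python
from collections import defaultdict

names = ["John Smith", "Jane Doe", "Bob Smith", "Ann Lee"]

# Group the names by surname, allowing multiple entries per key.
by_surname = defaultdict(list)
for name in names:
    by_surname[name.split()[-1]].append(name)

# Removing every Smith is now a single dictionary operation.
removed = by_surname.pop("Smith", [])
remaining = [n for group in by_surname.values() for n in group]

print(removed, remaining)
```

Building the index still requires one pass over the data, so this pays off mainly when removals by key happen repeatedly.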
But the solution you're going to find here is elegance through a different data structure, not a different algorithm. Maybe you can do better if the list is sorted, or something, but iteration over a list is your only method here.
edit: one does realize he asked for 'efficiency'... all these suggested methods just iterate over the list, which is the same as what he suggested.