Elegant ways to remove items from sequences in Python?

Posted on 2024-07-04 08:42:33

When I am writing code in Python, I often need to remove items from a list or other sequence type based on some criteria. I haven't found a solution that is both elegant and efficient, because removing items from a list while you are iterating over it causes elements to be skipped. For example, you can't do this:

for name in names:
    if name[-5:] == 'Smith':
        names.remove(name)
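
To see why this loop misbehaves, here is a minimal sketch with made-up sample data (the names in the list are assumptions for illustration). Removing an element shifts the later ones left while the loop's internal index still advances, so the element right after each removed one is never examined:

```python
# Sample data is hypothetical; any list with two adjacent matches shows the problem.
names = ['John Smith', 'Jane Smith', 'Bob Jones']

for name in names:
    if name[-5:] == 'Smith':
        names.remove(name)  # shifts 'Jane Smith' into the slot just visited

# 'Jane Smith' was skipped and is still in the list.
print(names)
```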

I usually end up doing something like this:

toremove = []
for name in names:
    if name[-5:] == 'Smith':
        toremove.append(name)
for name in toremove:
    names.remove(name)
del toremove

This is inefficient, fairly ugly and possibly buggy (how does it handle multiple 'John Smith' entries?). Does anyone have a more elegant solution, or at least a more efficient one?
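
On the duplicate question: the two-pass version does appear to cope with repeated entries, because a duplicate is appended to toremove once per occurrence and each remove() call deletes one occurrence. The cost is that every remove() rescans the list from the start. A small check, with sample data assumed for illustration:

```python
# Hypothetical data containing the same name twice.
names = ['John Smith', 'Ann Lee', 'John Smith']

toremove = []
for name in names:
    if name[-5:] == 'Smith':
        toremove.append(name)      # appended twice, once per occurrence
for name in toremove:
    names.remove(name)             # each call removes the first remaining match

print(names)
```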

How about one that works with dictionaries?

Comments (14)

薄情伤 2024-07-11 08:42:33

There are times when filtering (either using filter or a list comprehension) doesn't work. This happens when some other object is holding a reference to the list you're modifying and you need to modify the list in place.

for name in names[:]:
    if name[-5:] == 'Smith':
        names.remove(name)

The only difference from the original code is the use of names[:] instead of names in the for loop. That way the code iterates over a (shallow) copy of the list and the removals work as expected. Since the list copying is shallow, it's fairly quick.
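
A minimal sketch of the situation described above, assuming a second variable (roster, a name made up for this example) that refers to the same list. Because the loop mutates the original object, the other reference sees the removals too:

```python
names = ['John Smith', 'Ann Lee', 'Jane Smith']
roster = names  # hypothetical second reference held elsewhere in the program

for name in names[:]:          # iterate over a shallow copy
    if name[-5:] == 'Smith':
        names.remove(name)     # mutate the original list object

print(roster)            # the alias sees the removals
print(roster is names)   # still the same list object
```

Note that each remove() call still searches the list from the start, so this stays simple rather than fast for long lists.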

情愿 2024-07-11 08:42:33

names = list(filter(lambda x: x[-5:] != "Smith", names))

负佳期 2024-07-11 08:42:33

filter would be awesome for this. Simple example:

>>> names = ['mike', 'dave', 'jim']
>>> list(filter(lambda x: x != 'mike', names))
['dave', 'jim']

Edit: Corey's list comprehension is awesome too.

姐不稀罕 2024-07-11 08:42:33

Both solutions, filter and comprehension, require building a new list. I don't know enough of the Python internals to be sure, but I think that a more traditional (but less elegant) approach could be more efficient:

names = ['Jones', 'Vai', 'Smith', 'Perez']

item = 0
while item < len(names):
    if names[item] == 'Smith':
        # Delete by position; the next element slides into this slot,
        # so the index is not advanced.
        del names[item]
    else:
        item += 1

print(names)

Anyway, for short lists, I stick with either of the two solutions proposed earlier.

兔小萌 2024-07-11 08:42:33

To answer your question about working with dictionaries, you should note that Python 3.0 will include dict comprehensions:

>>> {i : chr(65+i) for i in range(4)}

In the meantime, you can do a quasi-dict comprehension this way:

>>> dict([(i, chr(65+i)) for i in range(4)])

Or as a more direct answer:

dict([(key, name) for key, name in some_dictionary.items() if name[-5:] != 'Smith'])
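
For the dictionary case, here is a short sketch of two options on current Python versions, using made-up data (the ids and names are assumptions). The first builds a new dict; the second deletes from the existing one:

```python
# Hypothetical data: employee id -> full name.
some_dictionary = {1: 'John Smith', 2: 'Ann Lee', 3: 'Jane Smith'}

# Option 1: build a new dict with a dict comprehension.
kept = {key: name for key, name in some_dictionary.items()
        if name[-5:] != 'Smith'}

# Option 2: delete in place. list() snapshots the items first, because
# changing a dict's size while iterating over it raises RuntimeError.
for key, name in list(some_dictionary.items()):
    if name[-5:] == 'Smith':
        del some_dictionary[key]

print(kept)
print(some_dictionary)
```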

行雁书 2024-07-11 08:42:33

If the list should be filtered in-place and the list is quite big, then the algorithms mentioned in the previous answers, which are based on list.remove(), may be unsuitable, because their computational complexity is O(n^2). In this case you can use the following not-so-Pythonic function:

def filter_inplace(func, original_list):
  """ Filters the original_list in-place.

  Removes elements from the original_list for which func() returns False.

  Algorithm's computational complexity is O(N), where N is the size
  of the original_list.
  """

  # Compact the list in-place.
  new_list_size = 0
  for item in original_list:
    if func(item):
      original_list[new_list_size] = item
      new_list_size += 1

  # Remove trailing items from the list.
  del original_list[new_list_size:]


a = [1, 2, 3, 4, 5, 6, 7]

# Remove even numbers from a in-place.
filter_inplace(lambda x: x & 1, a)

# Prints [1, 3, 5, 7]
print(a)

Edit:
Actually, the solution at https://stackoverflow.com/a/4639748/274937 is superior to my solution. It is more Pythonic and works faster. So, here is a new filter_inplace() implementation:

def filter_inplace(func, original_list):
  """ Filters the original_list in-place.

  Removes elements from the original_list for which func() returns False.

  Algorithm's computational complexity is O(N), where N is the size
  of the original_list.
  """
  original_list[:] = [item for item in original_list if func(item)]
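
A small sketch of what "in-place" buys you with the slice-assignment version, contrasted with plain rebinding. The variables alias and other are assumptions standing in for references held elsewhere in a program:

```python
def filter_inplace(func, original_list):
    """Keep only the items for which func(item) is true, in place."""
    original_list[:] = [item for item in original_list if func(item)]

a = [1, 2, 3, 4, 5, 6, 7]
alias = a                       # hypothetical second reference to the same list

filter_inplace(lambda x: x & 1, a)
print(alias)                    # alias reflects the change

b = [1, 2, 3, 4]
other = b
b = [x for x in b if x & 1]     # rebinding: builds a new list instead
print(other)                    # the old list is untouched
```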

天生の放荡 2024-07-11 08:42:33

The filter and list comprehensions are ok for your example, but they have a couple of problems:

  • They make a copy of your list and return the new one, which will be inefficient when the original list is really big.
  • They can be really cumbersome when the criteria to pick items (in your case, if name[-5:] == 'Smith') are more complicated or involve several conditions.

Your original solution is actually more efficient for very big lists, even if we can agree it's uglier. But if you worry that you can have multiple 'John Smith', it can be fixed by deleting based on position and not on value:

names = ['Jones', 'Vai', 'Smith', 'Perez', 'Smith']

toremove = []
for pos, name in enumerate(names):
    if name[-5:] == 'Smith':
        toremove.append(pos)
for pos in sorted(toremove, reverse=True):
    del names[pos]

print(names)

We can't pick a solution without considering the size of the list, but for big lists I would prefer your 2-pass solution instead of the filter or list comprehensions.
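
The position-based idea can also be sketched as a single pass, assuming it is acceptable to walk the indices from the end. Deleting at a given position only shifts items that have already been visited, so no separate toremove list is needed. Each del still moves the tail of the list, so this is a simplification rather than a guaranteed speed-up:

```python
names = ['Jones', 'Vai', 'Smith', 'Perez', 'Smith']

# Walk the indices from the end so deleting never shifts an unvisited item.
for pos in range(len(names) - 1, -1, -1):
    if names[pos][-5:] == 'Smith':
        del names[pos]

print(names)
```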

自此以后,行同陌路 2024-07-11 08:42:33

The obvious answer is the one that John and a couple other people gave, namely:

>>> names = [name for name in names if name[-5:] != "Smith"]       # <-- slower

But that has the disadvantage that it creates a new list object, rather than reusing the original object. I did some profiling and experimentation, and the most efficient method I came up with is:

>>> names[:] = (name for name in names if name[-5:] != "Smith")    # <-- faster

Assigning to "names[:]" basically means "replace the contents of the names list with the following value". It's different from just assigning to names, in that it doesn't create a new list object. The right hand side of the assignment is a generator expression (note the use of parentheses rather than square brackets). This will cause Python to iterate across the list.

Some quick profiling suggests that this is about 30% faster than the list comprehension approach, and about 40% faster than the filter approach.

Caveat: while this solution is faster than the obvious solution, it is more obscure, and relies on more advanced Python techniques. If you do use it, I recommend accompanying it with a comment. It's probably only worth using in cases where you really care about the performance of this particular operation (which is pretty fast no matter what). (In the case where I used this, I was doing A* beam search, and used this to remove search points from the search beam.)
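
The 30% and 40% figures come from the author's own measurements and will depend on the Python version and the data. Here is a sketch for repeating the comparison yourself. The synthetic data and the helper names (make_names, with_list_comprehension, with_generator) are invented for illustration, and no particular result is implied:

```python
import timeit

def make_names(n=10000):
    # Synthetic data (an assumption): every third name ends in 'Smith'.
    return ['Person %d %s' % (i, 'Smith' if i % 3 == 0 else 'Jones')
            for i in range(n)]

def with_list_comprehension(names):
    names[:] = [name for name in names if name[-5:] != 'Smith']
    return names

def with_generator(names):
    names[:] = (name for name in names if name[-5:] != 'Smith')
    return names

if __name__ == '__main__':
    for func in (with_list_comprehension, with_generator):
        # The measured time includes building the test list on every run.
        seconds = timeit.timeit(lambda: func(make_names()), number=20)
        print(func.__name__, round(seconds, 3))
```

Both functions assign through names[:], so either way the original list object is reused; only the right-hand side differs.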

一桥轻雨一伞开 2024-07-11 08:42:33

Two easy ways to accomplish just the filtering are:

  1. Using filter:

    names = list(filter(lambda name: name[-5:] != "Smith", names))

  2. Using list comprehensions:

    names = [name for name in names if name[-5:] != "Smith"]

Note that both cases keep the values for which the predicate function evaluates to True, so you have to reverse the logic (i.e. you say "keep the people who do not have the last name Smith" instead of "remove the people who have the last name Smith").

Edit: Funny... two people independently posted both of the answers I suggested while I was posting mine.
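
If inverting the condition feels awkward, the standard library's itertools.filterfalse keeps the items for which the predicate is false, so the test can be written as "is a Smith". A minimal sketch, where is_smith is a helper name made up for this example:

```python
from itertools import filterfalse

names = ['John Smith', 'Ann Lee', 'Jane Smith']

def is_smith(name):
    # Hypothetical helper: the "remove" condition, stated positively.
    return name[-5:] == 'Smith'

# filterfalse yields the items for which is_smith returns a false value.
names = list(filterfalse(is_smith, names))
print(names)
```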

两相知 2024-07-11 08:42:33

You can also iterate backwards over the list:

for name in reversed(names):
    if name[-5:] == 'Smith':
        names.remove(name)

This has the advantage that it does not create a new list (as filter or a list comprehension does) and uses an iterator instead of a list copy (as [:] does).

Note that although removing elements while iterating backwards is safe, inserting them is somewhat trickier.

(り薆情海 2024-07-11 08:42:33

Using a list comprehension:

names = [x for x in names if x[-5:] != "Smith"]

季末如歌 2024-07-11 08:42:33

Here is my filter_inplace implementation, which can be used to filter items from a list in place. I came up with it independently before finding this page. It is the same algorithm as the one PabloG posted, just made more generic. If reversed is set to True, it instead removes the items for which conditionFunc returns True; a sort of reversed filter, if you will.

def filter_inplace(conditionFunc, items, reversed=False):
    index = 0
    while index < len(items):
        item = items[index]

        shouldRemove = not conditionFunc(item)
        if reversed: shouldRemove = not shouldRemove

        if shouldRemove:
            # Delete by position rather than by value.
            del items[index]
        else:
            index += 1

梦回旧景 2024-07-11 08:42:33

In the case of a set:

toRemove = set()
for item in mySet:
    if is_unwelcome(item):  # is_unwelcome is a placeholder for your own test
        toRemove.add(item)
mySet = mySet - toRemove
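
A runnable sketch of the same idea for sets, assuming a placeholder predicate (is_unwelcome, which here simply flags even numbers). It shows both building a new set and updating the existing one with difference_update:

```python
my_set = {1, 2, 3, 4, 5, 6}
alias = my_set  # hypothetical second reference to the same set

def is_unwelcome(item):
    # Placeholder predicate: treat even numbers as unwelcome.
    return item % 2 == 0

# Option 1: build a new set with a set comprehension.
kept = {item for item in my_set if not is_unwelcome(item)}

# Option 2: update in place. The set of unwelcome items is fully built
# before difference_update starts modifying my_set.
my_set.difference_update({item for item in my_set if is_unwelcome(item)})

print(kept)
print(my_set)
```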

一城柳絮吹成雪 2024-07-11 08:42:33

Well, this is clearly an issue with the data structure you are using. Use a hashtable for example. Some implementations support multiple entries per key, so one can either pop the newest element off, or remove all of them.

But the solution you are going to find is elegance through a different data structure, not a different algorithm. Maybe you can do better if the list is sorted, or something, but otherwise iterating over the list is your only method here.

edit: one does realize he asked for 'efficiency'... all these suggested methods just iterate over the list, which is the same as what he suggested.
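
One way to read the hash-table suggestion, sketched with collections.defaultdict. The grouping key (the last word of each name) and the sample data are assumptions made for this example. Once names are grouped by surname, dropping every Smith is a single pop on the dictionary:

```python
from collections import defaultdict

names = ['John Smith', 'Ann Lee', 'Jane Smith', 'Bob Jones']

# Group full names by surname (assumed here to be the last word).
by_surname = defaultdict(list)
for name in names:
    by_surname[name.split()[-1]].append(name)

# Removing every Smith is one dictionary operation, independent of
# how many other names are stored.
removed = by_surname.pop('Smith', [])

remaining = [name for group in by_surname.values() for name in group]
print(removed)
print(remaining)
```

The trade-off is that the original list order is only preserved within each surname group, and building the index is itself a pass over the list.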
