An Elegant Way to Remove Consecutive Duplicate Elements from a List

Published 2024-12-08 02:00:48

Comments (4)

︶葆Ⅱㄣ 2024-12-15 02:00:48

Here is a version based on Karl's answer which doesn't require copies of the list (tmp, the slices, and the zipped list). izip is significantly faster than (Python 2) zip for large lists. chain is slightly slower than slicing but doesn't require a tmp object or copies of the list. islice plus making a tmp is a bit faster, but requires more memory and is less elegant.

from itertools import izip, chain  # Python 2; on Python 3 use the built-in zip
[y for x, y, z in izip(chain((None, None), li),
                       chain((None,), li),
                       chain(li, (None,))) if x != y != z]
# chain(li, (None,)) pads the end so the last element is checked too
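On Python 3, izip is gone and the built-in zip is already lazy, so the same idea can be sketched as follows (the function name and sample list are my own, and a fresh object() sentinel is used so the list may safely contain None):

```python
from itertools import chain

def remove_consecutive_dupes(li):
    # Keep each element only if it differs from both of its neighbors.
    # The two leading chains shift the list by one and two positions, and
    # chain(li, (sentinel,)) pads the end so the last element is checked too.
    sentinel = object()  # never compares equal to real data, so None is safe
    return [y for x, y, z in zip(chain((sentinel, sentinel), li),
                                 chain((sentinel,), li),
                                 chain(li, (sentinel,)))
            if x != y != z]

print(remove_consecutive_dupes([1, 2, 2, 3, 3, 3, 4]))  # → [1, 4]
```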

A timeit test shows it to be approximately twice as fast as Karl's answer or my fastest groupby version for short groups.

Make sure to use a value other than None (like object()) if your list can contain None.

Use this version if you need it to work on an iterator / iterable that isn't a sequence, or your groups are long:

from itertools import groupby
[key for key, group in groupby(li)
        if (next(group) or True) and next(group, None) is None]
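The trick is that each group yielded by groupby is itself an iterator: the first next(group) consumes one element (with `or True` guarding against falsy values such as 0), and next(group, None) returning None proves the run had exactly one element. A quick check on a hypothetical sample list:

```python
from itertools import groupby

li = [1, 0, 0, 2, 2, 2, 3]
# A key survives only if its group is exhausted after a single element.
result = [key for key, group in groupby(li)
          if (next(group) or True) and next(group, None) is None]
print(result)  # → [1, 3]
```

Note that without `or True`, the run of zeros would be dropped incorrectly, since next(group) returning 0 is falsy and would short-circuit the `and`.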

timeit shows it's about ten times faster than the other version for 1,000 item groups.

Earlier, slow versions:

[key for key, group in groupby(li) if sum(1 for i in group) == 1]
[key for key, group in groupby(li) if len(tuple(group)) == 1]
倒数 2024-12-15 02:00:48

agf's answer is good if the size of the groups is small, but if there are enough duplicates in a row, it is more efficient not to "sum 1" over those groups:

from itertools import groupby
[key for key, group in groupby(li) if all(i == 0 for i, j in enumerate(group))]
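A quick sanity check (sample list is my own) showing that this version agrees with the other approaches; all() short-circuits as soon as enumerate yields index 1, so a long run of duplicates is abandoned after its second element rather than being counted in full:

```python
from itertools import groupby

li = [1, 2, 2, 3, 3, 3, 4]
# all(...) is True only when the group never reaches index 1,
# i.e. the run consists of exactly one element.
result = [key for key, group in groupby(li)
          if all(i == 0 for i, j in enumerate(group))]
print(result)  # → [1, 4]
```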
淡淡離愁欲言轉身 2024-12-15 02:00:48
tmp = [object()] + li + [object()]
re = [y for x, y, z in zip(tmp[2:], tmp[1:-1], tmp[:-2]) if y != x and y != z]
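For illustration (the sample list is my own): the object() sentinels at both ends of tmp never compare equal to real elements, so the first and last items of li are effectively compared only against their single real neighbor:

```python
li = [1, 2, 2, 3, 3, 3, 4]
tmp = [object()] + li + [object()]  # unique sentinels guard both ends
# x runs one position ahead of y and z one behind:
# keep y only when it differs from both neighbors.
re = [y for x, y, z in zip(tmp[2:], tmp[1:-1], tmp[:-2]) if y != x and y != z]
print(re)  # → [1, 4]
```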
享受孤独 2024-12-15 02:00:48

The other solutions use various itertools helpers and comprehensions, and probably look more "Pythonic". However, a quick timing test I ran showed that this generator was a bit faster:

_undef = object()

def itersingles(source):
    cur = _undef
    dup = True
    for elem in source:
        if dup:
            if elem != cur:
                cur = elem
                dup = False
        else:
            if elem == cur:
                dup = True
            else:
                yield cur
                cur = elem
    if not dup:
        yield cur

source = [0, 1, 2, 3, 3, 4, 3, 2, 2, 2, 1, 0, 0]
result = list(itersingles(source))
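As a cross-check (my own addition), a simple groupby-based reference gives the same answer on this sample, since runs of length one are exactly the elements itersingles yields:

```python
from itertools import groupby

def singles_ref(source):
    # Reference implementation: keep keys whose run has length exactly 1.
    return [key for key, group in groupby(source) if len(list(group)) == 1]

source = [0, 1, 2, 3, 3, 4, 3, 2, 2, 2, 1, 0, 0]
print(singles_ref(source))  # → [0, 1, 2, 4, 3, 1]
```

This matches the result produced by the generator above on the same source list.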