Here is a version based on Karl's answer which doesn't require copies of the list (tmp, the slices, and the zipped list). izip is significantly faster than (Python 2) zip for large lists. chain is slightly slower than slicing but doesn't require a tmp object or copies of the list. islice plus making a tmp is a bit faster, but requires more memory and is less elegant.
from itertools import izip, chain
[y for x, y, z in izip(chain((None, None), li),
                       chain((None,), li),
                       chain(li, (None,))) if x != y != z]
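For Python 3, where izip is gone (the built-in zip is already lazy), the same neighbor-comparison idea can be sketched like this. The function name `consecutive_singles` is mine, and an object() sentinel is used for the padding so that lists containing None are handled:

```python
from itertools import chain

def consecutive_singles(li):
    # Sentinel that compares unequal to every real element, so lists
    # containing None are handled correctly.
    pad = object()
    # Walk three staggered views of the list; each element y is kept
    # only if it differs from both its left neighbor x and its right
    # neighbor z. Padding both ends means the first and last elements
    # are compared too.
    return [y for x, y, z in zip(chain((pad, pad), li),
                                 chain((pad,), li),
                                 chain(li, (pad,)))
            if x != y != z]

print(consecutive_singles([0, 1, 2, 3, 3, 4, 3, 2, 2, 2, 1, 0, 0]))
# -> [0, 1, 2, 4, 3, 1]
```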
A timeit test shows it to be approximately twice as fast as Karl's answer or my fastest groupby version for short groups.
Make sure to use a value other than None (like object()) if your list can contain Nones.
Use this version if you need it to work on an iterator / iterable that isn't a sequence, or your groups are long:
from itertools import groupby
[key for key, group in groupby(li)
 if (next(group) or True) and next(group, None) is None]
timeit shows it's about ten times faster than the other version for 1,000 item groups.
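That timing claim can be rechecked with a sketch along these lines (Python 3 shown; the function names and the 1,000-item test list are mine, and absolute numbers will vary by machine):

```python
import timeit
from itertools import groupby

# One 1,000-element run of duplicates followed by a lone element.
li = [1] * 1000 + [2]

def count_all(li):
    # Counts every element of each run before deciding.
    return [k for k, g in groupby(li) if sum(1 for i in g) == 1]

def count_two(li):
    # Consumes at most two elements per run.
    return [k for k, g in groupby(li)
            if (next(g) or True) and next(g, None) is None]

# Both versions agree: only the lone element survives.
assert count_all(li) == count_two(li) == [2]

for fn in (count_all, count_two):
    print(fn.__name__, timeit.timeit(lambda: fn(li), number=1000))
```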
Earlier, slow versions:
[key for key, group in groupby(li) if sum(1 for i in group) == 1]
[key for key, group in groupby(li) if len(tuple(group)) == 1]
agf's answer is good if the groups are small, but if there are enough duplicates in a row, it is more efficient not to "sum 1" over those groups:
[key for key, group in groupby(li) if all(i == 0 for i, j in enumerate(group))]
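Wrapped up runnably (Python 3; the function name is mine), the trick is that all() fails as soon as enumerate numbers a second element of a run, so a long run is never counted to the end:

```python
from itertools import groupby

def singles_shortcircuit(li):
    # enumerate numbers the elements of each run; all(i == 0 ...) is
    # False as soon as a second element (i == 1) appears, so all()
    # stops early. groupby then skips the rest of the run internally.
    return [key for key, group in groupby(li)
            if all(i == 0 for i, j in enumerate(group))]

print(singles_shortcircuit([0, 1, 2, 3, 3, 4, 3, 2, 2, 2, 1, 0, 0]))
# -> [0, 1, 2, 4, 3, 1]
```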
The other solutions use various itertools helpers and comprehensions, and probably look more "Pythonic". However, a quick timing test I ran showed this generator was a bit faster:
_undef = object()

def itersingles(source):
    cur = _undef
    dup = True
    for elem in source:
        if dup:
            if elem != cur:
                cur = elem
                dup = False
        else:
            if elem == cur:
                dup = True
            else:
                yield cur
                cur = elem
    if not dup:
        yield cur

source = [0, 1, 2, 3, 3, 4, 3, 2, 2, 2, 1, 0, 0]
result = list(itersingles(source))
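As a cross-check (a Python 3 sketch; `singles_groupby` is my name for the straightforward version), the generator's output on the sample list matches a plain groupby filter:

```python
from itertools import groupby

def singles_groupby(seq):
    # Keep each key whose consecutive run has length exactly one.
    return [k for k, g in groupby(seq) if len(list(g)) == 1]

sample = [0, 1, 2, 3, 3, 4, 3, 2, 2, 2, 1, 0, 0]
print(singles_groupby(sample))
# -> [0, 1, 2, 4, 3, 1]
```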