python列表中的不连续切片

发布于 2024-11-03 12:03:17 字数 372 浏览 1 评论 0原文

我正在寻找一种有效的方法来实现这一点，我认为这是类似切片的操作：

>>> mylist = range(100)
>>>magicslicer(mylist, 10, 20)
[0,1,2,3,4,5,6,7,8,9,30,31,32,33,34,35,36,37,38,39,60,61,62,63......,97,98,99]

想法是：切片获取 10 个元素，然后跳过 20 个元素，然后获取下一个 10 个元素，然后跳过接下来的 20 个，依此类推。

我认为如果可能的话我不应该使用循环，因为使用切片的原因（我猜）是为了在单个操作中有效地进行“提取”。

感谢您的阅读。

原文

I'm looking for an efficient way of achieving this, which I think is a slicing-like operation:

>>> mylist = range(100)
>>>magicslicer(mylist, 10, 20)
[0,1,2,3,4,5,6,7,8,9,30,31,32,33,34,35,36,37,38,39,60,61,62,63......,97,98,99]

the idea is: the slicing gets 10 elements, then skips 20 elements, then gets next 10, then skips next 20, and so on.

I think I should not use loops if possible, for the very reason to use slice is (I guess) to do the "extraction" efficiently in a single operation.

Thanks for reading.

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

你的呼吸 2024-11-10 12:03:18

>>>[mylist[start:start+10] for start in mylist[::30]]
>>>[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9], [30, 31, 32, 33, 34, 35, 36, 37, 38, 39], [60, 61, 62, 63, 64, 65, 66, 67, 68, 69], [90, 91, 92, 93, 94, 95, 96, 97, 98, 99]]

但我得到了一个列表:(

>>>[mylist[start:start+10] for start in mylist[::30]]
>>>[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9], [30, 31, 32, 33, 34, 35, 36, 37, 38, 39], [60, 61, 62, 63, 64, 65, 66, 67, 68, 69], [90, 91, 92, 93, 94, 95, 96, 97, 98, 99]]

but i obtain a list of list :(

回复收藏 0 原文

薔薇婲 2024-11-10 12:03:18

mylist = range(100)

otherlist = ['21','31','689','777','479','51','71','yut','poi','ger',
             '11','61','789','zozozozo','8888','1']



def magic_slicer(iterable,keep,throw):
        it = iter(iterable).next
        for n in xrange((len(iterable)//keep+throw)+1):
                for i in xrange(keep):  yield it()
                for i in xrange(throw):  it()

print list(magic_slicer(mylist,10,20))
print
print list(magic_slicer(otherlist,2,3))


print '__________________'


def magic_slicer2(iterable,keep,throw):
        return ( x for i,x in enumerate(iterable) if -1< i%(keep+throw)<keep) 

print list(magic_slicer2(mylist,10,20))
print
print list(magic_slicer2(otherlist,2,3))

结果

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99]

['21', '31', '51', '71', '11', '61', '1']
__________________
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99]

['21', '31', '51', '71', '11', '61', '1']

mylist = range(100)

otherlist = ['21','31','689','777','479','51','71','yut','poi','ger',
             '11','61','789','zozozozo','8888','1']



def magic_slicer(iterable,keep,throw):
        it = iter(iterable).next
        for n in xrange((len(iterable)//keep+throw)+1):
                for i in xrange(keep):  yield it()
                for i in xrange(throw):  it()

print list(magic_slicer(mylist,10,20))
print
print list(magic_slicer(otherlist,2,3))


print '__________________'


def magic_slicer2(iterable,keep,throw):
        return ( x for i,x in enumerate(iterable) if -1< i%(keep+throw)<keep) 

print list(magic_slicer2(mylist,10,20))
print
print list(magic_slicer2(otherlist,2,3))

result

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99]

['21', '31', '51', '71', '11', '61', '1']
__________________
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99]

['21', '31', '51', '71', '11', '61', '1']

回复收藏 0 原文

那小子欠揍 2024-11-10 12:03:18

[x for x in range(100) if x%30 < 10] 是另一种方法。但是，随着列表大小的增长，这可能会很慢。

同一行上的函数

def magic_slice(n, no_elems, step):
    s = no_elems + step
    return [x for x in range(n) if x%s < no_elems]

[x for x in range(100) if x%30 < 10] is another way to do it. But, this can be slow as the list size grows.

A function on the same lines

def magic_slice(n, no_elems, step):
    s = no_elems + step
    return [x for x in range(n) if x%s < no_elems]

回复收藏 0 原文

我还不会笑 2024-11-10 12:03:17

itertools.compress（2.7/3.1 中的新功能）很好地支持这样的用例，特别是与 itertools.cycle 结合使用时：

from itertools import cycle, compress
seq = range(100)
criteria = cycle([True]*10 + [False]*20) # Use whatever pattern you like
>>> list(compress(seq, criteria))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99]

Python 2.7 计时（相对于 Sven 的显式列表理解））：

$ ./python -m timeit -s "a = range(100)" "[x for start in range(0, len(a), 30) for x in a[start:start+10]]"
100000 loops, best of 3: 4.96 usec per loop

$ ./python -m timeit -s "from itertools import cycle, compress" -s "a = range(100)" -s "criteria = cycle([True]*10 + [False]*20)" "list(compress(a, criteria))"
100000 loops, best of 3: 4.76 usec per loop

Python 3.2 计时（也相对于 Sven 的显式列表理解）：

$ ./python -m timeit -s "a = range(100)" "[x for start in range(0, len(a), 30) for x in a[start:start+10]]"
100000 loops, best of 3: 7.41 usec per loop

$ ./python -m timeit -s "from itertools import cycle, compress" -s "a = range(100)" -s "criteria = cycle([True]*10 + [False]*20)" "list(compress(a, criteria))"
100000 loops, best of 3: 4.78 usec per loop

可以看出，相对于 2.7 中的内联列表理解，它并没有太大区别，但通过避免开销，在 3.2 中提供了显着帮助隐式嵌套范围。

如果目标是迭代结果序列而不是将其转换为完全实现的列表，则在 2.7 中也可以看到类似的差异：

$ ./python -m timeit -s "a = range(100)" "for x in (x for start in range(0, len(a), 30) for x in a[start:start+10]): pass"
100000 loops, best of 3: 6.82 usec per loop
$ ./python -m timeit -s "from itertools import cycle, compress" -s "a = range(100)" -s "criteria = cycle([True]*10 + [False]*20)" "for x in compress(a, criteria): pass"
100000 loops, best of 3: 3.61 usec per loop

对于特别长的模式，可以将模式表达式中的列表替换为 < code>chain(repeat(True, 10), Repeat(False, 20)) 这样就不必在内存中完全创建它。

itertools.compress (new in 2.7/3.1) nicely supports use cases like this one, especially when combined with itertools.cycle:

from itertools import cycle, compress
seq = range(100)
criteria = cycle([True]*10 + [False]*20) # Use whatever pattern you like
>>> list(compress(seq, criteria))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99]

Python 2.7 timing (relative to Sven's explicit list comprehension):

$ ./python -m timeit -s "a = range(100)" "[x for start in range(0, len(a), 30) for x in a[start:start+10]]"
100000 loops, best of 3: 4.96 usec per loop

$ ./python -m timeit -s "from itertools import cycle, compress" -s "a = range(100)" -s "criteria = cycle([True]*10 + [False]*20)" "list(compress(a, criteria))"
100000 loops, best of 3: 4.76 usec per loop

Python 3.2 timing (also relative to Sven's explicit list comprehension):

$ ./python -m timeit -s "a = range(100)" "[x for start in range(0, len(a), 30) for x in a[start:start+10]]"
100000 loops, best of 3: 7.41 usec per loop

$ ./python -m timeit -s "from itertools import cycle, compress" -s "a = range(100)" -s "criteria = cycle([True]*10 + [False]*20)" "list(compress(a, criteria))"
100000 loops, best of 3: 4.78 usec per loop

As can be seen, it doesn't make a great deal of difference relative to the in-line list comprehension in 2.7, but helps significantly in 3.2 by avoiding the overhead of the implicit nested scope.

A similar difference can also be seen in 2.7 if the aim is to iterate over the resulting sequence rather than turn it into a fully realised list:

$ ./python -m timeit -s "a = range(100)" "for x in (x for start in range(0, len(a), 30) for x in a[start:start+10]): pass"
100000 loops, best of 3: 6.82 usec per loop
$ ./python -m timeit -s "from itertools import cycle, compress" -s "a = range(100)" -s "criteria = cycle([True]*10 + [False]*20)" "for x in compress(a, criteria): pass"
100000 loops, best of 3: 3.61 usec per loop

For especially long patterns, it is possible to replace the list in the pattern expression with an expression like chain(repeat(True, 10), repeat(False, 20)) so that it never has to be fully created in memory.

回复收藏 0 原文

泛滥成性 2024-11-10 12:03:17

也许最好的方法是直接的方法：

def magicslicer(seq, take, skip):
    return [x for start in range(0, len(seq), take + skip)
              for x in seq[start:start + take]]

我认为你无法避免循环。

编辑：由于它被标记为“性能”，因此这里与 a = range(100) 的模解进行比较：

In [2]: %timeit [x for start in range(0, len(a), 30)
                   for x in a[start:start + 10]]
100000 loops, best of 3: 4.89 us per loop

In [3]: %timeit [e for i, e in enumerate(a) if i % 30 < 10]
100000 loops, best of 3: 14.8 us per loop

Maybe the best way is the straight-forward approach:

def magicslicer(seq, take, skip):
    return [x for start in range(0, len(seq), take + skip)
              for x in seq[start:start + take]]

I don't think you can avoid the loops.

Edit: Since this is tagged "performance", here a comparison with the modulo solution for a = range(100):

In [2]: %timeit [x for start in range(0, len(a), 30)
                   for x in a[start:start + 10]]
100000 loops, best of 3: 4.89 us per loop

In [3]: %timeit [e for i, e in enumerate(a) if i % 30 < 10]
100000 loops, best of 3: 14.8 us per loop

回复收藏 0 原文

怪我鬧 2024-11-10 12:03:17

不幸的是，我认为切片无法做到这一点。我会使用列表推导式解决问题

>>> a = range(100)
>>> a
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 
 10, 11, 12, 13, 14, 15, 16, 17, 18, 19,
    ...
 90, 91, 92, 93, 94, 95, 96, 97, 98, 99]
>>> [e for i, e in enumerate(a) if i % 30 < 10]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 
 30, 31, 32, 33, 34, 35, 36, 37, 38, 39,
 60, 61, 62, 63, 64, 65, 66, 67, 68, 69,
 90, 91, 92, 93, 94, 95, 96, 97, 98, 99]

I think that slices cannot do it, unfortunately. I'd solve the problem using list comprehensions

>>> a = range(100)
>>> a
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 
 10, 11, 12, 13, 14, 15, 16, 17, 18, 19,
    ...
 90, 91, 92, 93, 94, 95, 96, 97, 98, 99]
>>> [e for i, e in enumerate(a) if i % 30 < 10]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 
 30, 31, 32, 33, 34, 35, 36, 37, 38, 39,
 60, 61, 62, 63, 64, 65, 66, 67, 68, 69,
 90, 91, 92, 93, 94, 95, 96, 97, 98, 99]

回复收藏 0 原文

扮仙女 2024-11-10 12:03:17

我不知道你是否只使用数字，但如果你坚持使用 numpy，有一种更快的方法。但是，只有当您的列表由展平的相同长度的子列表组成时，以下内容才有效。

用于比较：

import numpy as np
from itertools import cycle, compress

startList = list(range(0, 3000))
startNpArray = np.linspace(0,2999,3000,dtype=np.int)

def WithNumpy(seq, keep, skip):
    return seq.reshape((-1, keep+skip))[:,:keep+1].flatten()

def WithItertools(seq, keep, skip):
    criteria = cycle([True]*keep + [False]* skip)
    return list(compress(seq, criteria))

%timeit WithNumpy(startListNp, 10, 20)
>>> 2.59 µs ± 48.7 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit WithItertools(startList, 10, 20)
>>> 33.5 µs ± 911 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

I don't know if you are working with numbers only, but in case you are there is a faster way if you stick to numpy. But the following will only work if you have list that consists of sublists of equal length that were flattened out.

For comparison:

import numpy as np
from itertools import cycle, compress

startList = list(range(0, 3000))
startNpArray = np.linspace(0,2999,3000,dtype=np.int)

def WithNumpy(seq, keep, skip):
    return seq.reshape((-1, keep+skip))[:,:keep+1].flatten()

def WithItertools(seq, keep, skip):
    criteria = cycle([True]*keep + [False]* skip)
    return list(compress(seq, criteria))

%timeit WithNumpy(startListNp, 10, 20)
>>> 2.59 µs ± 48.7 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit WithItertools(startList, 10, 20)
>>> 33.5 µs ± 911 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

回复收藏 0 原文

素年丶 2024-11-10 12:03:17

我会使用一个循环：

#!/usr/bin/env python


def magicslicer(l, stepsize, stepgap):
    output = []
    i = 0
    while i<len(l):
        output += l[i:i+stepsize]
        i += stepsize + stepgap
    return output


mylist = range(100)
print magicslicer(mylist,10,20)

I'd use a loop:

#!/usr/bin/env python


def magicslicer(l, stepsize, stepgap):
    output = []
    i = 0
    while i<len(l):
        output += l[i:i+stepsize]
        i += stepsize + stepgap
    return output


mylist = range(100)
print magicslicer(mylist,10,20)

回复收藏 0 原文

~没有更多了~