Iterate over list slices

Posted on 2024-08-02 22:17:34 · 745 words · 9 views · 0 comments

I want an algorithm to iterate over list slices. The slice size is set outside the function and can vary.

In my mind it is something like:

for list_of_x_items in fatherList:
    foo(list_of_x_items)

Is there a way to properly define list_of_x_items, or some other way of doing this, using Python 2.5?


edit1: Clarification: both the "partitioning" and "sliding window" terms sound applicable to my task, but I am no expert, so I will explain the problem in a bit more depth:

The fatherList is a multilevel numpy.array that I am getting from a file. The function has to find the averages of series (the user provides the length of a series). For averaging I am using the mean() function. Now for the question expansion:

edit2: How can I modify the function you have provided to store the extra items and use them when the next fatherList is fed to the function?

For example, if the list has length 10 and the chunk size is 3, then the 10th member of the list is stored and prepended to the next list.


樱花坊 2024-08-09 22:17:35

Expanding on the answer of @Ants Aasma: as of Python 3.7 the handling of StopIteration changed (PEP 479). A StopIteration that escapes from inside a generator is now turned into a RuntimeError, so it has to be caught explicitly. A compatible version would be:

from itertools import chain, islice

def ichunked(seq, chunksize):
    it = iter(seq)
    while True:
        try:
            yield chain([next(it)], islice(it, chunksize - 1))
        except StopIteration:
            return
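
A quick usage sketch for the function above, assuming Python 3. The function body is restated (with the `next()` call pulled out into a variable named `first`, which is my own naming) so the snippet runs on its own. The thing to watch is that every chunk shares one underlying iterator, so each chunk should be fully consumed before the next one is requested:

```python
from itertools import chain, islice

def ichunked(seq, chunksize):
    """Yield lazy chunks of up to `chunksize` items from any iterable."""
    it = iter(seq)
    while True:
        try:
            first = next(it)  # raises StopIteration once the input is exhausted
        except StopIteration:
            return  # end the generator cleanly, as PEP 479 requires
        yield chain([first], islice(it, chunksize - 1))

# list(chunk) drains each chunk before the generator is advanced again.
chunks = [list(chunk) for chunk in ichunked(range(10), 3)]
print(chunks)  # [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9]]
```

If the chunks are stored and consumed later, out of order, they will interleave, because they all pull from the same iterator.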
总以为 2024-08-09 22:17:35

Your question could use some more detail, but how about:

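def iterate_over_slices(the_list, slice_size):
    # "+ 1" so that the final window is not skipped
    for start in range(len(the_list) - slice_size + 1):
        # named "window" to avoid shadowing the built-in slice
        window = the_list[start:start + slice_size]
        foo(window)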
秋风の叶未落 2024-08-09 22:17:35

For a near one-liner (after the itertools import), in the vein of Nadia's answer, that handles lengths not divisible by the chunk size without padding:

>>> import itertools as itt
>>> chunksize = 5
>>> myseq = range(18)
>>> cnt = itt.count()
>>> print [ tuple(grp) for k,grp in itt.groupby(myseq, key=lambda x: cnt.next()//chunksize%2)]
[(0, 1, 2, 3, 4), (5, 6, 7, 8, 9), (10, 11, 12, 13, 14), (15, 16, 17)]

If you want, you can get rid of the itertools.count() requirement by using enumerate(), at the cost of a rather uglier expression:

[ [e[1] for e in grp] for k,grp in itt.groupby(enumerate(myseq), key=lambda x: x[0]//chunksize%2) ]

(In this example the enumerate() would be superfluous, but not all sequences are neat ranges like this, obviously)

Nowhere near as neat as some other answers, but useful in a pinch, especially if already importing itertools.
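
For readers on Python 3, where `print` is a function and `cnt.next()` no longer exists, here is a rough equivalent of the enumerate() variant. This is only a sketch: the function name `chunks_by_groupby` is made up for illustration, and the key is simply `index // chunksize` (the `% 2` in the original is not needed, since groupby only cares that the key changes between consecutive groups).

```python
from itertools import groupby

def chunks_by_groupby(seq, chunksize):
    """Group items by (index // chunksize); the last chunk may be shorter."""
    grouped = groupby(enumerate(seq), key=lambda pair: pair[0] // chunksize)
    # Each group yields (index, item) pairs; keep only the items.
    return [[item for _, item in group] for _, group in grouped]

result = chunks_by_groupby(range(18), 5)
print(result)
# [[0, 1, 2, 3, 4], [5, 6, 7, 8, 9], [10, 11, 12, 13, 14], [15, 16, 17]]
```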

-柠檬树下少年和吉他 2024-08-09 22:17:35

A function that slices a list or an iterator into chunks of a given size. It also correctly handles the case where the last chunk is smaller:

def slice_iterator(data, slice_len):
    it = iter(data)
    while True:
        items = []
        for index in range(slice_len):
            try:
                item = next(it)
            except StopIteration:
                if items == []:
                    return # we are done
                else:
                    break # exits the "for" loop
            items.append(item)
        yield items

Usage example:

for slice in slice_iterator([1,2,3,4,5,6,7,8,9,10],3):
    print(slice)

Result:

[1, 2, 3]
[4, 5, 6]
[7, 8, 9]
[10]
人│生佛魔见 2024-08-09 22:17:34

If you want to divide a list into slices you can use this trick:

list_of_slices = zip(*(iter(the_list),) * slice_size)

For example

>>> zip(*(iter(range(10)),) * 3)
[(0, 1, 2), (3, 4, 5), (6, 7, 8)]

If the number of items is not divisible by the slice size and you want to pad the list with None, you can do this:

>>> map(None, *(iter(range(10)),) * 3)
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, None, None)]

It is a dirty little trick


OK, I'll explain how it works. It'll be tricky to explain but I'll try my best.

First a little background:

In Python you can multiply a list by a number like this:

[1, 2, 3] * 3 -> [1, 2, 3, 1, 2, 3, 1, 2, 3]
([1, 2, 3],) * 3 -> ([1, 2, 3], [1, 2, 3], [1, 2, 3])

And an iterator object can be consumed once like this:

>>> l=iter([1, 2, 3])
>>> l.next()
1
>>> l.next()
2
>>> l.next()
3

The zip function returns a list of tuples, where the i-th tuple contains the i-th element from each of the argument sequences or iterables. For example:

zip([1, 2, 3], [20, 30, 40]) -> [(1, 20), (2, 30), (3, 40)]
zip(*[(1, 20), (2, 30), (3, 40)]) -> [(1, 2, 3), (20, 30, 40)]

The * in front of zip's argument is used to unpack it into separate positional arguments.
So

zip(*[(1, 20), (2, 30), (3, 40)])

is actually equivalent to

zip((1, 20), (2, 30), (3, 40))

but works with a variable number of arguments

Now back to the trick:

list_of_slices = zip(*(iter(the_list),) * slice_size)

iter(the_list) -> converts the list into an iterator.

(iter(the_list),) * N -> creates a tuple of N references to the same iterator over the_list.

zip(*(iter(the_list),) * N) -> feeds that tuple of iterators into zip, which in turn groups their items into N-sized tuples. But since all N entries are in fact references to the same iterator, iter(the_list), the result comes from repeated calls to next() on that one iterator, so each tuple holds N consecutive items.

I hope that explains it. I advise you to go with an easier-to-understand solution. I was only tempted to mention this trick because I like it.
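
The interactive examples above are Python 2, where zip returns a list and map(None, ...) pads with None. As a sketch of how the same trick might look on Python 3 (an assumption about what the reader is running, not part of the original answer): zip is lazy there, and map(None, ...) is gone, so itertools.zip_longest does the padding. The helper name `grouper` is borrowed from the itertools recipes.

```python
from itertools import zip_longest

def grouper(iterable, n, fillvalue=None):
    """Collect items into n-sized tuples, padding the last one with fillvalue."""
    iterators = [iter(iterable)] * n  # n references to ONE shared iterator
    return zip_longest(*iterators, fillvalue=fillvalue)

# Plain zip silently drops the incomplete final group.
truncated = list(zip(*[iter(range(10))] * 3))
# zip_longest keeps it and pads with None.
padded = list(grouper(range(10), 3))

print(truncated)  # [(0, 1, 2), (3, 4, 5), (6, 7, 8)]
print(padded)     # [(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, None, None)]
```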

贪恋 2024-08-09 22:17:34

If you want to be able to consume any iterable you can use these functions:

from itertools import chain, islice

def ichunked(seq, chunksize):
    """Yields items from an iterator in iterable chunks."""
    it = iter(seq)
    while True:
        yield chain([it.next()], islice(it, chunksize-1))

def chunked(seq, chunksize):
    """Yields items from an iterator in list chunks."""
    for chunk in ichunked(seq, chunksize):
        yield list(chunk)
℡Ms空城旧梦 2024-08-09 22:17:34

Use a generator:

big_list = [1,2,3,4,5,6,7,8,9]
slice_length = 3
def sliceIterator(lst, sliceLen):
    for i in range(len(lst) - sliceLen + 1):
        yield lst[i:i + sliceLen]

for slice in sliceIterator(big_list, slice_length):
    foo(slice)

sliceIterator implements a "sliding window" of width sliceLen over the sequence lst, i.e. it produces overlapping slices: [1,2,3], [2,3,4], [3,4,5], ... Not sure if that is the OP's intention, though.
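
Since the question mentions both "partitioning" and "sliding window", a side-by-side sketch may help to tell them apart. The function names `sliding_windows` and `partitions` are made up for this illustration; the only real difference is the step of the range (1 versus the chunk size):

```python
def sliding_windows(lst, width):
    """Overlapping slices: the start index advances by 1 each time."""
    return [lst[i:i + width] for i in range(len(lst) - width + 1)]

def partitions(lst, size):
    """Non-overlapping slices: the start index advances by `size`."""
    return [lst[i:i + size] for i in range(0, len(lst), size)]

data = [1, 2, 3, 4, 5]
windows = sliding_windows(data, 3)
parts = partitions(data, 3)

print(windows)  # [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
print(parts)    # [[1, 2, 3], [4, 5]]
```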

小兔几 2024-08-09 22:17:34

Do you mean something like:

def callonslices(size, fatherList, foo):
  for i in xrange(0, len(fatherList), size):
    foo(fatherList[i:i+size])

If this is roughly the functionality you want, you could dress it up a bit as a generator:

def sliceup(size, fatherList):
  for i in xrange(0, len(fatherList), size):
    yield fatherList[i:i+size]

and then:

def callonslices(size, fatherList, foo):
  for sli in sliceup(size, fatherList):
    foo(sli)
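
To connect this with the averaging described in the question, here is a rough Python 3 sketch. It assumes a flat list of numbers and uses the standard-library statistics.mean in place of the numpy mean() the asker mentions, purely so that the snippet has no third-party dependency; `range` replaces `xrange`, and the parameter is renamed `father_list`.

```python
from statistics import mean

def sliceup(size, father_list):
    """Yield consecutive, non-overlapping slices of length `size`."""
    for i in range(0, len(father_list), size):
        yield father_list[i:i + size]

# The last slice holds the single leftover item, 10.
averages = [mean(chunk) for chunk in sliceup(3, [1, 2, 3, 4, 5, 6, 7, 8, 9, 10])]
print(averages)  # [2, 5, 8, 10]
```

Note that the trailing short slice is averaged on its own here; whether that is wanted, or whether it should be carried over to the next input, depends on the task.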
病毒体 2024-08-09 22:17:34

Answer to the last part of the question:

question update: How to modify the
function you have provided to store
the extra items and use them when the
next fatherList is fed to the
function?

If you need to store state then you can use an object for that.

class Chunker(object):
    """Split `iterable` into evenly sized chunks.

    Leftovers are remembered and yielded at the next call.
    """
    def __init__(self, chunksize):
        assert chunksize > 0
        self.chunksize = chunksize        
        self.chunk = []

    def __call__(self, iterable):
        """Yield items from `iterable`, `self.chunksize` at a time."""
        assert len(self.chunk) < self.chunksize
        for item in iterable:
            self.chunk.append(item)
            if len(self.chunk) == self.chunksize:
                # yield collected full chunk
                yield self.chunk
                self.chunk = [] 

Example:

chunker = Chunker(3)
for s in "abcd", "efgh":
    for chunk in chunker(s):
        print ''.join(chunk)

if chunker.chunk: # is there anything left?
    print ''.join(chunker.chunk)

Output:

abc
def
gh
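
The example above uses Python 2 print statements. As a hedged alternative for Python 3, here is an eager variant of the same idea. The class name `CarryOverChunker` and its `feed` method are hypothetical names for this sketch; unlike the generator-based Chunker, it returns the full chunks as a list straight away, so the stored leftovers do not depend on the caller exhausting a generator.

```python
class CarryOverChunker:
    """Split successive inputs into fixed-size chunks, carrying leftovers over."""

    def __init__(self, chunksize):
        if chunksize <= 0:
            raise ValueError("chunksize must be positive")
        self.chunksize = chunksize
        self.leftover = []  # items that did not fill a whole chunk yet

    def feed(self, iterable):
        """Return the full chunks; keep the remainder for the next call."""
        buffer = self.leftover + list(iterable)
        full = len(buffer) - len(buffer) % self.chunksize
        self.leftover = buffer[full:]
        return [buffer[i:i + self.chunksize] for i in range(0, full, self.chunksize)]

chunker = CarryOverChunker(3)
first = chunker.feed("abcd")   # 'd' is kept back
second = chunker.feed("efgh")  # 'd' is prepended to this input
print(first, second, chunker.leftover)
# [['a', 'b', 'c']] [['d', 'e', 'f']] ['g', 'h']
```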
甜点 2024-08-09 22:17:34

I am not sure, but it seems you want to do what is called a moving average. numpy provides facilities for this (the convolve function).

>>> x = numpy.array(range(20))
>>> x
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19])
>>> n = 2 # moving average window
>>> numpy.convolve(numpy.ones(n)/n, x)[n-1:-n+1]
array([  0.5,   1.5,   2.5,   3.5,   4.5,   5.5,   6.5,   7.5,   8.5,
         9.5,  10.5,  11.5,  12.5,  13.5,  14.5,  15.5,  16.5,  17.5,  18.5])

The nice thing is that it accommodates different weighting schemes nicely (just change numpy.ones(n) / n to something else).

You can find more complete material here:
http://www.scipy.org/Cookbook/SignalSmooth
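
To make it clearer what the convolve call above computes, here is a plain-Python sketch of the same unweighted moving average, with no numpy involved. The function name `moving_average` is made up for this illustration. In numpy itself, passing mode='valid' to numpy.convolve should give the fully overlapping part directly, without the manual [n-1:-n+1] slice.

```python
def moving_average(values, n):
    """Mean of every run of n consecutive values (fully overlapping windows only)."""
    return [sum(values[i:i + n]) / n for i in range(len(values) - n + 1)]

result = moving_average(list(range(20)), 2)
print(result[:3], result[-1], len(result))  # [0.5, 1.5, 2.5] 18.5 19
```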
