izip_longest in itertools: what's going on here?

Posted on 2024-10-22 03:28:01

I'm struggling to understand how the code below works. It's from http://docs.python.org/library/itertools.html#itertools.izip_longest, and is the pure-Python equivalent of the izip_longest iterator. I'm especially mystified by the sentinel function. How does it work?

from itertools import chain, izip, repeat  # Python 2 names; Python 3 has no izip

def izip_longest(*args, **kwds):
    # izip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
    fillvalue = kwds.get('fillvalue')
    def sentinel(counter = ([fillvalue]*(len(args)-1)).pop):
        yield counter()         # yields the fillvalue, or raises IndexError
    fillers = repeat(fillvalue)
    iters = [chain(it, sentinel(), fillers) for it in args]
    try:
        for tup in izip(*iters):
            yield tup
    except IndexError:
        pass


Comments (3)

空气里的味道 2024-10-29 03:28:01

OK, we can do this. First, the sentinel. The expression ([fillvalue]*(len(args)-1)) creates a list holding one fill value fewer than there are iterables in args; for the example above, with two iterables, that is ['-']. counter is then bound to that list's pop method. sentinel itself is a generator function: each generator it returns pops one item from the list the first time it is advanced, and yields it. You can advance each generator returned by sentinel exactly once, and whenever it yields, it yields fillvalue. The total number of items yielded by all generators returned by sentinel is len(args) - 1 (thanks to Sven Marnach for clarifying that, I misunderstood it).
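To see that shared budget in isolation, here is a minimal sketch for Python 3. The helper name make_sentinel and its parameter n_iterables are made up for this illustration and are not part of itertools:

```python
def make_sentinel(n_iterables, fillvalue='-'):
    # Hypothetical helper: builds a sentinel() like the one in the quoted code.
    # The default argument is evaluated once, right here, so every generator
    # created by sentinel() pops from the same list.
    def sentinel(counter=([fillvalue] * (n_iterables - 1)).pop):
        yield counter()  # yields fillvalue, or raises IndexError when empty
    return sentinel

sentinel = make_sentinel(3)    # pretend izip_longest received 3 iterables
first = list(sentinel())       # pops one of the two stored fill values
second = list(sentinel())      # pops the other one
try:
    list(sentinel())           # the shared list is empty now
    third_failed = False
except IndexError:
    third_failed = True

print(first, second, third_failed)
```

With three iterables, two sentinels succeed and the third raises IndexError.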

Now check out this:

iters = [chain(it, sentinel(), fillers) for it in args]

That is the trick. iters is a list that contains an iterator for each iterable in args. Each of these iterators does the following:

  1. Iterate over all items in the corresponding iterable from args.
  2. Iterate over sentinel once, yielding fillvalue.
  3. Repeat fillvalue for all eternity.
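The three phases of a single chained iterator can be sketched like this in Python 3. This is only an illustration: the one-element list stands in for the shared list, and islice is used purely to cut off the endless repeat():

```python
from itertools import chain, islice, repeat

fillvalue = '-'

def sentinel(counter=[fillvalue].pop):
    # One stored fill value is enough for this single-iterator demo.
    yield counter()

one_iter = chain('xy', sentinel(), repeat(fillvalue))

# Take only the first five items, because repeat() never ends.
first_five = list(islice(one_iter, 5))
print(first_five)
```

The first two items come from 'xy', the third from the sentinel, and everything after that from repeat(fillvalue).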

Now, as established earlier, the sentinels together can yield only len(args)-1 times; the next one raises an IndexError. This is fine, because a sentinel is reached only once its iterable is exhausted, so the IndexError is raised exactly when the last remaining (that is, the longest) iterable runs out. At that point izip_longest catches the IndexError and stops.
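For anyone who wants to step through this on Python 3, here is a sketch of the same logic with the lazy built-in zip() in place of izip(). The name zip_longest_sketch is made up for this example; in real code you would simply use itertools.zip_longest:

```python
from itertools import chain, repeat

def zip_longest_sketch(*args, fillvalue=None):
    # Same logic as the quoted code, with zip() instead of Python 2's izip().
    def sentinel(counter=([fillvalue] * (len(args) - 1)).pop):
        yield counter()  # yields fillvalue, or raises IndexError
    fillers = repeat(fillvalue)
    iters = [chain(it, sentinel(), fillers) for it in args]
    try:
        for tup in zip(*iters):
            yield tup
    except IndexError:
        pass  # the last remaining (longest) input has just run out

result = list(zip_longest_sketch('ABCD', 'xy', fillvalue='-'))
print(result)
```

Here 'xy' runs out first and uses the only stored fill value. When 'ABCD' runs out as well, its sentinel finds the list empty and raises the IndexError that ends the loop.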

You are welcome.

P.S.: I hope this is understandable.

撩心不撩汉 2024-10-29 03:28:01

The function sentinel() returns iterators yielding fillvalue exactly once. The total number of fillvalues yielded by all iterators returned by sentinel() is limited to n-1, where n is the number of iterables passed to izip_longest(). After this number of fillvalues has been exhausted, further iteration over an iterator returned by sentinel() will raise an IndexError.

This function is used to detect whether all iterators have been exhausted: Each iterator is chain()ed with an iterator returned by sentinel(). If all iterators are exhausted, an iterator returned by sentinel() will get iterated over for the nth time, resulting in an IndexError, triggering the end of izip_longest() in turn.

So far I have explained what sentinel() does, not how it works. When izip_longest() is called, the definition of sentinel() is evaluated. As part of evaluating the definition, the default argument to sentinel() is evaluated as well, once per call to izip_longest(). The code is equivalent to

fillvalue_list = [fillvalue] * (len(args)-1)
def sentinel():
    yield fillvalue_list.pop()

Storing this in a default argument instead of in a variable in an enclosing scope is just an optimisation, as is storing .pop itself in the default argument, since it saves looking it up every time an iterator returned by sentinel() is iterated over.
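A small sketch (Python 3) that puts the two spellings side by side and counts how many sentinels succeed before the IndexError. The helper names sentinel_via_default, sentinel_via_closure and drain are invented for this illustration:

```python
def sentinel_via_default(n, fillvalue='-'):
    # Default-argument spelling, as in the quoted code.
    def sentinel(counter=([fillvalue] * (n - 1)).pop):
        yield counter()
    return sentinel

def sentinel_via_closure(n, fillvalue='-'):
    # Closure spelling: the list lives in the enclosing scope instead.
    fillvalue_list = [fillvalue] * (n - 1)
    def sentinel():
        yield fillvalue_list.pop()
    return sentinel

def drain(sentinel):
    # Count how many sentinels can be consumed before IndexError is raised.
    count = 0
    try:
        while True:
            list(sentinel())
            count += 1
    except IndexError:
        return count

default_count = drain(sentinel_via_default(4))
closure_count = drain(sentinel_via_closure(4))
print(default_count, closure_count)
```

Both variants allow n-1 sentinels, and each call to the outer function starts with a fresh list, just as each call to izip_longest() does.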

铜锣湾横着走 2024-10-29 03:28:01

The definition of sentinel is almost equivalent to

def sentinel():
    yield ([fillvalue] * (len(args) - 1)).pop()

except that it receives the list's bound pop method as a default argument. A default argument is evaluated when the function is defined, so once per call to izip_longest rather than once per call to sentinel. The function object therefore "remembers" the single list [fillvalue] * (len(args) - 1) instead of constructing it anew on every call, and that is what lets the list eventually run empty and raise IndexError.
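To make the difference concrete, here is a sketch in Python 3 that contrasts the two definitions for the two-iterable case. The names naive_sentinel, shared_sentinel and n_args are invented for this illustration:

```python
fillvalue, n_args = '-', 2

def naive_sentinel():
    # Builds a brand-new one-element list on every call, so pop() always succeeds.
    yield ([fillvalue] * (n_args - 1)).pop()

def shared_sentinel(counter=([fillvalue] * (n_args - 1)).pop):
    # The list is built once, at definition time, and shared by all calls.
    yield counter()

naive_results = [list(naive_sentinel()) for _ in range(3)]

shared_results = []
for _ in range(3):
    try:
        shared_results.append(list(shared_sentinel()))
    except IndexError:
        shared_results.append('IndexError')

print(naive_results)
print(shared_results)
```

The naive version keeps yielding the fill value forever, so it could never signal that every input is exhausted. The shared version runs dry after len(args) - 1 uses.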
