izip_longest in itertools: how does an IndexError raised inside the iterator work?
In this question, @lazyr asks how the following code for the izip_longest iterator, taken from the documentation, works:
def izip_longest_from_docs(*args, **kwds):
    # izip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
    fillvalue = kwds.get('fillvalue')
    def sentinel(counter = ([fillvalue]*(len(args)-1)).pop):
        yield counter() # yields the fillvalue, or raises IndexError
    fillers = repeat(fillvalue)
    iters = [chain(it, sentinel(), fillers) for it in args]
    try:
        for tup in izip(*iters):
            yield tup
    except IndexError:
        pass
While trying to understand how it works, I stumbled upon a question: "What if IndexError is raised inside one of the iterators that are passed to izip_longest as arguments?"
So I wrote some test code:
from itertools import izip_longest, repeat, chain, izip

def izip_longest_from_docs(*args, **kwds):
    # The code is exactly the same as shown above
    ....

def gen1():
    for i in range(5):
        yield i

def gen2():
    for i in range(10):
        if i==8:
            raise IndexError # simulate an IndexError raised inside the iterator
        yield i

for i in izip_longest_from_docs(gen1(),gen2(), fillvalue = '-'):
    print('{i[0]} {i[1]}'.format(**locals()))

print('\n')

for i in izip_longest(gen1(),gen2(), fillvalue = '-'):
    print('{i[0]} {i[1]}'.format(**locals()))
It turned out that izip_longest from the itertools module and izip_longest_from_docs behave differently.
The output of the code above:
>>>
0 0
1 1
2 2
3 3
4 4
- 5
- 6
- 7
0 0
1 1
2 2
3 3
4 4
- 5
- 6
- 7
Traceback (most recent call last):
  File "C:/..., line 31, in <module>
    for i in izip_longest(gen1(),gen2(), fillvalue = '-'):
  File "C:/... test_IndexError_inside iterator.py", line 23, in gen2
    raise IndexError
IndexError
So it is clear that izip_longest from itertools propagated the IndexError exception (as I think it should), but izip_longest_from_docs 'swallowed' it, because it took it for the signal from sentinel to stop iterating.
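The swallowing can be reproduced in isolation. The following is a minimal sketch, assuming Python 3 (where izip is the built-in zip and izip_longest is itertools.zip_longest); the names zip_longest_from_docs and failing are hypothetical and exist only for this illustration.

```python
from itertools import chain, repeat, zip_longest

def zip_longest_from_docs(*args, fillvalue=None):
    # Python 3 port of the recipe quoted above: izip becomes zip.
    def sentinel(counter=([fillvalue] * (len(args) - 1)).pop):
        yield counter()  # yields the fillvalue, or raises IndexError
    fillers = repeat(fillvalue)
    iters = [chain(it, sentinel(), fillers) for it in args]
    try:
        for tup in zip(*iters):
            yield tup
    except IndexError:
        # This handler cannot tell the sentinel's IndexError
        # apart from one raised by an input iterator.
        pass

def failing():
    # Hypothetical input iterator that fails part-way through.
    yield 0
    yield 1
    raise IndexError("raised by the input iterator, not by the sentinel")

# The recipe stops quietly instead of reporting the error.
swallowed = list(zip_longest_from_docs("abc", failing(), fillvalue="-"))

# The built-in version lets the same error reach the caller.
try:
    list(zip_longest("abc", failing(), fillvalue="-"))
    propagated = False
except IndexError:
    propagated = True
```

Because the except clause wraps the whole zip loop, any IndexError that surfaces while an input iterator is being advanced is caught by the same handler that waits for the sentinel.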
My question is: how did they work around IndexError propagation in the code in the itertools module?
In izip_longest_next, in the code of izip_longest, no sentinel is used. Instead, CPython keeps track of how many of the iterators are still active with a counter, and stops when the number active reaches zero.
If an error occurs, it ends iteration as if there are no iterators still active, and allows the error to propagate.
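The counting approach described above can be approximated in pure Python. This is a rough sketch assuming Python 3, not CPython's C source; the function name zip_longest_counted and the variable names are hypothetical.

```python
from itertools import repeat

def zip_longest_counted(*iterables, fillvalue=None):
    iterators = [iter(it) for it in iterables]
    active = len(iterators)  # how many inputs are not yet exhausted
    if not active:
        return
    while True:
        row = []
        for i, it in enumerate(iterators):
            try:
                value = next(it)
            except StopIteration:
                # Only normal exhaustion is handled here; any other
                # exception (IndexError included) propagates unchanged.
                active -= 1
                if not active:
                    return
                # Swap the exhausted input for an endless filler.
                iterators[i] = repeat(fillvalue)
                value = fillvalue
            row.append(value)
        yield tuple(row)
```

Since termination is decided by the counter rather than by an exception type borrowed from the built-ins, an IndexError raised inside an input iterator is never mistaken for the end-of-data signal.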
The code:
The simplest solution I see:
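One possible fix, sketched here under the assumption of Python 3 and not necessarily the one the answer had in mind, is to keep the sentinel design but signal with a private exception class instead of IndexError. The names zip_longest_private_signal and _Exhausted are hypothetical.

```python
from itertools import chain, repeat

def zip_longest_private_signal(*args, fillvalue=None):
    class _Exhausted(Exception):
        """Raised only by sentinel(); input iterators cannot raise it by accident."""

    remaining = len(args) - 1  # fill values the sentinels may still hand out

    def sentinel():
        nonlocal remaining
        if remaining == 0:
            raise _Exhausted  # the last input has just run out
        remaining -= 1
        yield fillvalue

    fillers = repeat(fillvalue)
    iters = [chain(it, sentinel(), fillers) for it in args]
    try:
        for tup in zip(*iters):
            yield tup
    except _Exhausted:
        pass  # only the private signal ends iteration quietly
```

Because _Exhausted is defined inside the function, code outside it has no way to raise that class, so an IndexError from an input iterator passes through the except clause untouched.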