Cheapest way to detect and build independent iterators
Suppose I'm writing a function taking in an iterable, and my function wants to be agnostic as to whether that iterable is actually an iterator yet or not.
(This is a common situation, right? I think basically all the itertools functions are written this way. Take in an iterable, return an iterator.)
If I call, for instance, `itertools.tee(•, 2)` on an object, and it happens to not be an iterator yet, that presumably means it would be cheaper just to call `iter` on it twice to get my two independent iterators. Are itertools functions smart enough to know this, and if not, what's the best way to avoid unnecessary costs in this way?
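To make the comparison concrete, here is a small sketch of the two options being weighed, assuming the argument is a list (the variable names are illustrative only):

```python
from itertools import tee

data = [10, 20, 30]

# Option 1: tee, which accepts any iterable but buffers items internally
# until every returned iterator has consumed them.
t1, t2 = tee(data, 2)

# Option 2: call iter twice, which is only valid when each call returns
# a fresh iterator (true for a list, not for a generator or other iterator).
i1, i2 = iter(data), iter(data)

print(list(t1), list(t2))   # [10, 20, 30] [10, 20, 30]
print(list(i1), list(i2))   # [10, 20, 30] [10, 20, 30]
```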
Comments (1)
Observe:
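A minimal sketch of what this looks like in practice, using only built-ins (the variable names are illustrative):

```python
items = [1, 2, 3]            # a plain iterable, not an iterator

it = iter(items)             # a list iterator
print(it is items)           # False: the list hands out a separate iterator object
print(iter(it) is it)        # True: an iterator's __iter__ returns itself

a, b = iter(items), iter(items)
print(a is b)                # False: each call on the list gives a fresh iterator
print(next(a), next(b))      # 1 1: the two iterators advance independently
```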
So you do not need to worry whether the argument to your function is an iterable or already an iterator. You can call `__iter__` on something that is already an iterator, and in that case it just returns `self`. This is not an expensive call, and it is cheaper than anything you could do to test whether the argument is an iterator, such as checking whether it has a `__next__` method (and then having to call `__iter__` on it anyway if it doesn't).
Update
We now see that there is a bit of a difference between passing your function an iterable and passing it an iterator (depending on how the iterator is written, of course), since calling `iter` twice on the former will give you two distinct iterators while calling `iter` twice on the latter will not. `itertools.tee`, as an example, is expecting an iterable. If you pass it an iterator that implements `__iter__` by returning `self`, it will clearly work, since `tee` does not need two independent iterators to do its magic.
But if you are writing an iterator that is passed an iterable, and that is implemented by internally using two or more iterators on the passed iterable, what you really want to be testing for is whether what is being passed supports multiple, concurrent, independent iterations, regardless of whether it is an iterator or just a plain iterable:
Prints:
The writer of the original, passed iterator must write it in such a way that it supports multiple, concurrent, independent iterations.
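One way to sketch such a check is below. The helper name `independent_iterators` is hypothetical, and the test `iter(x) is x` is only a heuristic. It assumes that any object which does not return itself from `iter` hands out a fresh, independent iterator on every call, which is true of the built-in containers but is not guaranteed for arbitrary classes.

```python
from itertools import tee

def independent_iterators(iterable, n=2):
    """Return n (n >= 1) independent iterators over `iterable`.

    Hypothetical helper. Assumption: an object whose iter() returns
    itself is a one-shot iterator and must go through tee(); anything
    else is trusted to give a fresh iterator on every iter() call.
    """
    first = iter(iterable)
    if first is iterable:
        # Already an iterator: buffering via tee is the safe way to split it.
        return tee(first, n)
    # Re-iterable: keep the iterator already made and create the rest directly.
    return (first, *(iter(iterable) for _ in range(n - 1)))

a, b = independent_iterators([1, 2, 3])          # list: no tee needed
print(list(a), list(b))                          # [1, 2, 3] [1, 2, 3]

gen = (x * x for x in range(3))                  # generator: falls back to tee
c, d = independent_iterators(gen)
print(list(c), list(d))                          # [0, 1, 4] [0, 1, 4]
```

The list case skips `tee` and its buffering entirely, while the generator case still gets correct results at the cost of `tee` storing items until both iterators have consumed them.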