How to clone a generator object?
Consider this scenario:

    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    import os

    walk = os.walk('/home')

    for root, dirs, files in walk:
        for pathname in dirs+files:
            print(os.path.join(root, pathname))

    for root, dirs, files in walk:
        for pathname in dirs+files:
            print(os.path.join(root, pathname))

We need to use the same walk data more than once. I have a benchmark scenario, and using the same walk data is mandatory to get helpful results.

I've tried walk2 = walk to clone it and use it in the second iteration, but it didn't work. How can I copy it? Is it even possible?
You can use itertools.tee(). Note that this might "need significant extra storage", as the documentation points out.
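A minimal sketch of how tee() might be applied to the walk from the question. The temporary directory and the file name a.txt are stand-ins invented for this example so that it runs anywhere; for real use, os.walk('/home') would take their place.

```python
import itertools
import os
import tempfile

# Hypothetical throwaway tree, so the example does not depend on /home.
with tempfile.TemporaryDirectory() as top:
    os.mkdir(os.path.join(top, "sub"))
    open(os.path.join(top, "sub", "a.txt"), "w").close()

    walk = os.walk(top)
    # tee() returns independent iterators over one underlying generator.
    # After this call, the original `walk` object should not be used again.
    walk1, walk2 = itertools.tee(walk)

    first = [os.path.join(root, name)
             for root, dirs, files in walk1
             for name in dirs + files]
    second = [os.path.join(root, name)
              for root, dirs, files in walk2
              for name in dirs + files]
```

Because walk1 is consumed completely before walk2 starts, tee() has to buffer every item in the meantime. The itertools documentation notes that list() is generally faster than tee() in that situation.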
If you know you are going to iterate through the whole generator for every usage, you will probably get the best performance by unrolling the generator to a list and using the list multiple times.

    walk = list(os.walk('/home'))
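As a rough illustration of the difference, the sketch below first shows a generator coming up empty on a second pass, then the list snapshot being reused twice. The temporary directory and the file a.txt are assumptions made up for the example.

```python
import os
import tempfile

# Hypothetical throwaway tree, so the example does not depend on /home.
with tempfile.TemporaryDirectory() as top:
    open(os.path.join(top, "a.txt"), "w").close()

    gen = os.walk(top)
    gen_first = list(gen)    # consumes the generator
    gen_second = list(gen)   # nothing left: the generator is exhausted

    walk = list(os.walk(top))  # snapshot taken once, held in memory
    passes = []
    for _ in range(2):
        # Each pass iterates the same stored (root, dirs, files) tuples.
        passes.append([os.path.join(root, name)
                       for root, dirs, files in walk
                       for name in dirs + files])
```

This is also why walk2 = walk does not help: both names refer to the same generator object, which can only be consumed once.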
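Define a function that builds the walk, either as a generator function or as a plain function that returns the os.walk() generator. Both are used the same way: call the function each time you need a new iteration.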
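The two function variants might look like the sketch below. The names walk_gen, walk_plain and collect, the path parameter, and the temporary directory are all assumptions introduced for illustration; they are not taken from the answer.

```python
import os
import tempfile


def walk_gen(path):
    # Generator function: every call starts a new traversal.
    for entry in os.walk(path):
        yield entry


def walk_plain(path):
    # Plain function: simply returns a new os.walk() generator.
    return os.walk(path)


def collect(make_walk, path):
    # Both variants are consumed in exactly the same way.
    return [os.path.join(root, name)
            for root, dirs, files in make_walk(path)
            for name in dirs + files]


# Hypothetical throwaway tree, so the example does not depend on /home.
with tempfile.TemporaryDirectory() as top:
    open(os.path.join(top, "a.txt"), "w").close()
    from_gen = collect(walk_gen, top)
    from_plain = collect(walk_plain, top)
    again = collect(walk_gen, top)  # a second call walks the tree again
```

Unlike a stored list, each call re-reads the filesystem, so the results can differ between calls if files change in the meantime.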
This is a good use case for functools.partial() to make a quick generator factory.

What functools.partial() does is hard to describe in plain words, but this is what it's for. It partially fills in a function's parameters without executing that function. Consequently, it acts as a function/generator factory.
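A minimal sketch of such a factory, assuming a temporary directory in place of '/home' so that it runs anywhere. The name walk_factory is chosen for the example.

```python
import os
import tempfile
from functools import partial

# Hypothetical throwaway tree, so the example does not depend on /home.
with tempfile.TemporaryDirectory() as top:
    os.mkdir(os.path.join(top, "sub"))

    # partial() stores os.walk together with its argument; nothing is walked yet.
    walk_factory = partial(os.walk, top)

    # Each call to the factory produces a separate generator.
    walk1, walk2 = walk_factory(), walk_factory()

    first = list(walk1)
    second = list(walk2)
```

Each generator performs its own traversal, so this trades repeated filesystem access for not holding the results in memory.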
This answer aims to extend and elaborate on what the other answers have expressed. The solution will necessarily vary depending on what exactly you aim to achieve.

If you want to iterate over the exact same result of os.walk multiple times, you will need to initialize a list from the os.walk iterable's items (i.e. walk = list(os.walk(path))).

If you must guarantee the data remains the same, that is probably your only option. However, there are several scenarios in which this is not possible or desirable.

- It is not possible to list() an iterable if the output is of sufficient size (e.g. attempting to list() an entire filesystem may freeze your computer).
- It is not desirable to list() an iterable if you wish to acquire "fresh" data prior to each use.

In the event that list() is not suitable, you will need to run your generator on demand. Note that generators are exhausted after each use, so this poses a slight problem. In order to "rerun" your generator multiple times, you can wrap its creation in an object that you can iterate again, so that each pass gets a new generator. This design pattern will allow you to keep your code DRY.
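One way to get a re-runnable walk is an iterable class whose __iter__ starts a new generator on every pass. The sketch below is an assumption about what such a pattern could look like; the class name WalkPaths, the temporary directory, and the file names are invented for the example.

```python
import os
import tempfile


class WalkPaths:
    """Hypothetical iterable that restarts os.walk() on every iteration."""

    def __init__(self, path):
        self.path = path

    def __iter__(self):
        # A new generator is created each time iteration begins.
        for root, dirs, files in os.walk(self.path):
            for name in dirs + files:
                yield os.path.join(root, name)


# Hypothetical throwaway tree, so the example does not depend on /home.
with tempfile.TemporaryDirectory() as top:
    open(os.path.join(top, "a.txt"), "w").close()
    walk = WalkPaths(top)

    first = sorted(walk)
    second = sorted(walk)   # the same object can be iterated again

    open(os.path.join(top, "b.txt"), "w").close()
    third = sorted(walk)    # a later pass sees "fresh" data
```

The traversal logic lives in one place, and every for loop over the object triggers a new walk of the directory.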
This "Python Generator Listeners" code allows you to have many listeners on a single generator, like os.walk, and even have someone "chime in" later.

    # Muxer is defined in the gist linked below.
    def walkme():
        return os.walk('/home')

    m1 = Muxer(walkme)
    m2 = Muxer(walkme)

Then m1 and m2 can even run in threads and process at their leisure.

See: https://gist.github.com/earonesty/cafa4626a2def6766acf5098331157b3