Python 生成器与回调函数
我有一个类使用递归回溯算法解决精确覆盖问题。最初,我使用在初始化期间传递给对象的回调函数来实现该类。每当找到解决方案时就会调用此回调。在查看其他人对同一问题的实现时,我发现他们使用yield语句来传递解决方案,换句话说,他们的代码是一个python生成器。我认为这是一个有趣的想法,所以我制作了一个新版本的类来使用产量。然后我对两个版本进行了比较测试,令我惊讶的是,我发现生成器版本的运行速度比回调版本慢 5 倍。请注意,除了切换回调的收益之外,代码是相同的。
这是怎么回事?我推测,因为生成器需要在生成之前保存状态信息,然后在下一次调用时重新启动时恢复该状态,所以正是这种保存/恢复使得生成器版本运行速度如此之慢。如果是这种情况,生成器必须保存和恢复多少状态信息?
python 专家有什么想法吗?
-- 已编辑 7:40 PDT
这是使用产量的求解器代码。将下面的第一个yield 替换为对回调函数的调用,并将以下循环与第二个yield 更改为递归调用以求解此代码的原始版本。
def solve(self):
for tp in self.pieces:
if self.inuse[tp.name]: continue
self.inuse[tp.name] = True
while tp.next_orientation() is not None:
if tp.insert_piece():
self.n_trials += 1
self.pieces_in += 1
self.free_cells -= tp.size
if self.pieces_in == len(self.pieces) or self.free_cells == 0:
self.solutions += 1
self.haveSolution = True
yield True
self.haveSolution = False
else:
self.table.next_base_square()
for tf in self.solve():
yield tf
tp.remove_piece()
self.pieces_in -= 1
self.table.set_base_square(tp.base_square)
self.free_cells += tp.size
self.inuse[tp.name] = False
tp.reset_orientation()
调用求解器的邮件循环(当然是在初始化之后)是
start_time = time.time()
for tf in s.solve():
printit(s)
end_time = time.time()
delta_time = end_time - start_time
在回调版本中,只需一次调用即可解决该循环。
I have a class that solves an exact cover problem using a recursive, backtracking algorithm. Originally, I implemented the class with a callback function I passed to the object during initialization. This callback is invoked whenever a solution is found. In looking at someone else's implementation of the same problem, I saw that they were using yield statements to pass a solution out, in other words, their code was a python generator. I thought this was an interesting idea so I made a new version of my class to use yields. I then ran comparison tests between the two versions and, to my surprise, I found the generator version ran 5 times slower than the callback version. Note that, except for switching in a yield for a callback, the code is identical.
What is going on here? I'm speculating that, because a generator needs to save state information before yielding and then restore that state when restarting at the next call, it is this save/restore that is what makes the generator version run so much slower. If this is the case, how much state information is the generator having to save and restore?
Any ideas from the python experts?
--Edited 7:40 PDT
Here is the solver code which uses yield. Replace the first yield below with a call to the callback function and change the following loop with the second yield to just a recursive call to solve for the original version of this code.
def solve(self):
for tp in self.pieces:
if self.inuse[tp.name]: continue
self.inuse[tp.name] = True
while tp.next_orientation() is not None:
if tp.insert_piece():
self.n_trials += 1
self.pieces_in += 1
self.free_cells -= tp.size
if self.pieces_in == len(self.pieces) or self.free_cells == 0:
self.solutions += 1
self.haveSolution = True
yield True
self.haveSolution = False
else:
self.table.next_base_square()
for tf in self.solve():
yield tf
tp.remove_piece()
self.pieces_in -= 1
self.table.set_base_square(tp.base_square)
self.free_cells += tp.size
self.inuse[tp.name] = False
tp.reset_orientation()
The mail loop which invokes the solver (after initialization, of course) is
start_time = time.time()
for tf in s.solve():
printit(s)
end_time = time.time()
delta_time = end_time - start_time
In the callback version, the loop is gone with just a single call to solve.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
我在评论中的意思是(“从递归函数中产生听起来需要额外的 for 循环才能将结果传递给调用者”)是这一行:
这些行递归地循环更深层递归阶段的结果。这意味着在递归的每个级别上都会迭代单个结果,从而导致大量不必要的循环。
让我用这个例子来说明:
正如您所看到的,这只是从 10 开始倒数,所以您期望迭代次数是线性的。但您可以看到,
n
呈二次方增长 -recurse(10)
循环超过 9 个项目,recurse(9)
循环超过 8 个项目,等等在。你拥有的项目越多,Python 在这些简单的代码上花费的时间就越多。回调完全避免了这个问题,所以我怀疑这是你的代码的问题。
PEP 380 的优化实现可以解决此问题(请参阅本段)。与此同时,我认为从递归函数中产生结果并不是一个好主意(至少如果它们深度递归的话),它们只是不能很好地协同工作。
What i meant in my comment, ("yielding from a recursive function sounds like it requires extra for loops to pass the results down to the caller") is this line:
These lines recursively loop over the results from the deeper recursion stages. That means that a single result is iterated over on each level of the recursion, resulting in a lot of unnecessary looping.
Let me illustrate with this example:
As you can see this simply counts down from 10, so you'd expect a a linear number of iterations. What you can see though is that
n
grows quadratically -recurse(10)
loops over 9 items,recurse(9)
over 8 items and so on.The more items you have, the more time Python spends on these simple lines. Callbacks completely avoid that problem, so I'd suspect that is the problem with your code.
A optimized implementation of PEP 380 could fix this (see this paragraph). In the meantime I don't think it's a good idea to yield from recursive functions (at least if they recurse deeply), they just don't work well together.