我有一些任务需要解决,目前最重要的部分是使脚本尽可能高效。我试图优化的元素之一是其中一个函数内的记忆。
所以我的问题是:以下 3-4 种方法中哪一种是在 Python 中实现记忆化最有效/最快的方法?
我提供的代码仅作为示例 - 如果其中一种方法更有效,但不是我提到的情况,请分享您所知道的。
解决方案 1 - 使用外部作用域中的可变变量
此解决方案通常显示为示例记忆,但我不确定它的效率如何。我听说使用全局变量(在这种情况下它是来自外部的变量,而不是全局范围的变量)效率较低。
def main():
memo = {}
def power_div(n):
try:
return memo[n]
except (KeyError):
memo[n] = (n ** 2) % 4 # example expression, should not matter
return memo[n]
# extensive usage of power_div() here
解决方案 2 - 使用默认的可变参数
我发现过去使用默认可变参数从外部作用域传递变量时,Python 首先在本地作用域中搜索变量,然后在全局作用域中搜索变量,从而跳过非本地作用域。范围(在本例中为函数 main()
内的范围)。因为默认参数仅在定义函数时初始化,并且只能在内部函数内部访问,所以也许它更有效?
def main():
def power_div(n, memo={}):
try:
return memo[n]
except (KeyError):
memo[n] = (n ** 2) % 4 # example expression, should not matter
return memo[n]
# extensive usage of power_div() here
或者也许以下版本(实际上是解决方案 1 和 2 的组合)更有效?
def main():
memo = {}
def power_div(n, memo=memo):
try:
return memo[n]
except (KeyError):
memo[n] = (n ** 2) % 4 # example expression, should not matter
return memo[n]
# extensive usage of power_div() here
解决方案 3 - 函数的属性
这是 Python 中另一个非常常见的记忆化示例 - 记忆化对象存储为函数本身的属性。
def main():
def power_div(n):
memo = power_div.memo
try:
return memo[n]
except (KeyError):
memo[n] = (n ** 2) % 4 # example expression, should not matter
return memo[n]
# extensive usage of power_div() here
总结
我对您对上述四种记忆解决方案的看法非常感兴趣。同样重要的是,使用记忆的函数位于另一个函数内。
我知道还有其他记忆化解决方案(例如 Memoize
装饰器),但我很难相信这是比上面列出的更有效的解决方案。如果我错了请纠正我。
提前致谢。
I have some task to solve and the most important part at the moment is to make the script as time-efficient as possible. One of the elements I am trying to optimize is memoization within one of the functions.
So my question is: Which of the following 3-4 methods is the most efficient / fastest method of implementing memoization in Python?
I have provided code only as an example - if one of the methods is more efficient, but not in the case I mentioned, please share what you know.
Solution 1 - using mutable variable from outer scope
This solution is often shown as the example memoization, but I am not sure how efficient it is. I have heard that using global variables (in this case it is variable from outer, not global scope) is less efficient.
def main():
memo = {}
def power_div(n):
try:
return memo[n]
except (KeyError):
memo[n] = (n ** 2) % 4 # example expression, should not matter
return memo[n]
# extensive usage of power_div() here
Solution 2 - using default, mutable argument
I have found somewhere that using default mutable arguments has been used in the past to pass variables from outer scope, when Python searched the variable first in the local scope, then in the global scope, skipping the nonlocal scope (in this case the scope within function main()
). Because default argument is initialized only at the time function is defined and is accessible only inside the inner function, maybe it is thus more efficient?
def main():
def power_div(n, memo={}):
try:
return memo[n]
except (KeyError):
memo[n] = (n ** 2) % 4 # example expression, should not matter
return memo[n]
# extensive usage of power_div() here
Or maybe the following version (being in fact a combination of solutions 1&2) is more efficient?
def main():
memo = {}
def power_div(n, memo=memo):
try:
return memo[n]
except (KeyError):
memo[n] = (n ** 2) % 4 # example expression, should not matter
return memo[n]
# extensive usage of power_div() here
Solution 3 - function's attribute
This is another quite common example of memoization in Python - the memoization object is stored as an attribute of the function itself.
def main():
def power_div(n):
memo = power_div.memo
try:
return memo[n]
except (KeyError):
memo[n] = (n ** 2) % 4 # example expression, should not matter
return memo[n]
# extensive usage of power_div() here
Summary
I am very interested in your opinions about the four above solutions for memoization. It is important also, that the function that uses memoization is within another function.
I know that there are also other solutions for memoization (such as Memoize
decorator), but it is hard for me to believe that this is more efficient solution than these listed above. Correct me if I am wrong.
Thanks in advance.
发布评论
评论(2)
不同风格的变量访问已在以下位置进行了计时和比较:http://code.activestate.com/recipes/577834-compare-speeds-of- Different-kinds-of-access-to-var
这是一个快速总结:本地访问击败了非本地(嵌套范围),后者击败了全局访问(模块范围),后者击败了对内置函数的访问。
您的解决方案#2(具有本地访问权限)应该获胜。解决方案#3 有一个慢点查找(需要字典查找)。解决方案#1 使用非本地(嵌套范围)访问,该访问使用单元变量(比字典查找更快,但比本地慢)。
另请注意,KeyError 异常类是全局查找,可以通过本地化来加速。您可以完全替换 try/ except 并使用
memo.get(n, sentinel)
代替。甚至可以通过使用绑定方法来加速。当然,最简单的速度提升可能只是通过尝试 pypy :-)简而言之,有很多方法可以调整这段代码。只要确保它值得。
The different styles of variable access have already been timed and compared at: http://code.activestate.com/recipes/577834-compare-speeds-of-different-kinds-of-access-to-var
Here's a quick summary: local access beats nonlocal (nested scopes) which beat global access (module scope) which beats access to builtins.
Your solution #2 (with local access) should win. Solution #3 has a slow-dotted lookup (which requires a dictionary lookup). Solution #1 uses nonlocal (nested scope) access which uses cell-variables (faster than a dict lookup but slower than locals).
Also note, the KeyError exception class is a global lookup and can be sped-up by localizing it. You could replace the try/except entirely and use a
memo.get(n, sentinel)
instead. And even that could be sped-up by using a bound method. Of course, your easiest speed boost may just come from trying out pypy :-)In short, there are many ways to tweak this code. Just make sure it's worth it.
为了让那些在寻找在 python 中进行记忆的方法时偶然发现这个问题的人受益,我推荐 fastcache。
它适用于 python 2 和 3,比上述任何方法都更快,并提供限制缓存大小的选项,以便它不会无意中变得太大:
安装 fastcache 很简单,使用
pip
:或
conda
:For the benefit of people who stumble on this question while looking for a way to do memoization in python, I recommend fastcache.
It works on python 2 and 3, is faster than any of the methods described above, and gives the option to limit cache size so that it does not inadvertently get too big:
Installing fastcache is simple, using
pip
:or
conda
: