Optimization of importing modules in Python
I am reading David Beazley's Python Reference book and he makes a point:
"For example, if you were performing a lot of square root operations, it is faster to use 'from math import sqrt' and 'sqrt(x)' rather than typing 'math.sqrt(x)'."
and:
"For calculations involving heavy use of methods or module lookups, it is almost always better to eliminate the attribute lookup by putting the operation you want to perform into a local variable first."
I decided to try it out:
first()
    def first():
        from collections import defaultdict
        x = defaultdict(list)
second()
    def second():
        import collections
        x = collections.defaultdict(list)
The results were:
2.15461492538
1.39850616455
Optimizations such as these probably don't matter to me. But I am curious as to why the opposite of what Beazley has written turns out to be true. Note that there is a difference of about 0.75 seconds, which is significant given that the task is trivial.
Why is this happening?
UPDATE:
I am getting the timings like this:
    from timeit import timeit
    print(timeit('first()', 'from __main__ import first'))
    print(timeit('second()', 'from __main__ import second'))
Comments (6)
The 'from collections import defaultdict' and 'import collections' statements should be outside the iterated timing loops, since you won't repeat doing them.
I guess that the 'from' syntax has to do more work than the 'import' syntax.
Using this test code:
I get results:
Which shows how much the repeated imports cost.
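A minimal sketch of the comparison this answer describes, assuming Python 3. The function names import_inside, import_outside and compare are hypothetical and are not taken from the original test code. One function repeats the import on every call, the other relies on a single module-level import.

```python
import collections
from timeit import timeit


def import_inside():
    # The import statement runs on every call. After the first call it is
    # only a lookup in sys.modules plus a name binding, but it is not free.
    from collections import defaultdict
    return defaultdict(list)


def import_outside():
    # The module was imported once at the top of the file, so only the
    # attribute lookup and the constructor call are timed here.
    return collections.defaultdict(list)


def compare(number=100_000):
    # Returns (seconds with the import inside, seconds with it outside).
    inside = timeit(import_inside, number=number)
    outside = timeit(import_outside, number=number)
    return inside, outside


if __name__ == "__main__":
    print(compare())
```

The absolute numbers depend on the machine and the interpreter version, so the sketch prints them rather than asserting which variant wins.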
I also get similar ratios between first() and second(); the only difference is that the timings are at the microsecond level.
I don't think that your timings measure anything useful. Try to figure out better test cases!
Update:
FWIW, here are some tests to support David Beazley's point.
So first() is some 20% slower than second().
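A sketch of the kind of test that isolates Beazley's point, assuming Python 3. The function names with_attribute_lookup and with_local_name are hypothetical and this is not the answerer's test code. The import happens once at module level, and only the lookup inside the loop differs between the two functions.

```python
import math
from timeit import timeit


def with_attribute_lookup(n=1000):
    # math.sqrt is resolved on every iteration: a global lookup for
    # "math" followed by an attribute lookup for "sqrt".
    total = 0.0
    for i in range(n):
        total += math.sqrt(i)
    return total


def with_local_name(n=1000):
    # Bind the function to a local variable once, so that only a
    # local-variable lookup remains inside the loop.
    sqrt = math.sqrt
    total = 0.0
    for i in range(n):
        total += sqrt(i)
    return total


if __name__ == "__main__":
    print(timeit(with_attribute_lookup, number=1000))
    print(timeit(with_local_name, number=1000))
```

Both functions compute the same value; how large the timing gap is depends on the interpreter version.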
first() doesn't save anything, since the module must still be accessed in order to import the name.
Also, you don't give your timing methodology, but given the function names it seems that first() performs the initial import, which always takes longer than subsequent imports since the module must be compiled and executed.
My guess is that your test is biased: the second implementation gains from the first one already having loaded the module, or just from having loaded it recently.
How many times did you try it? Did you switch up the order, etc.?
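The caching effect these answers refer to can be checked directly. Below is a small sketch, assuming Python 3; the helper name import_twice is hypothetical. It shows that a module is stored in sys.modules after its first import and that a later import returns the same object instead of executing the module again.

```python
import importlib
import sys


def import_twice(name="collections"):
    # The first import of a module has to locate it, compile it (or load
    # cached bytecode) and execute it; the result is kept in sys.modules.
    first = importlib.import_module(name)
    cached = name in sys.modules
    # A later import finds the entry in sys.modules and returns it
    # without executing the module body a second time.
    second = importlib.import_module(name)
    return cached, first is second


if __name__ == "__main__":
    print(import_twice())
```

Because of this cache, whichever function runs first in a benchmark may pay the one-time loading cost, which is one reason to swap the order and repeat the measurement.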
There is also the question of efficiency of reading/understanding the source code. Here's a real-life example (code from a Stack Overflow question):
Original:
Alternative:
Write your code as usual, importing a module and referencing its modules and constants as module.attribute. Then, either prefix your functions with the decorator for binding constants or bind all modules in your program by using the bind_all_modules function below:
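As a rough illustration of the idea behind such binding (resolving module.attribute once rather than on every call), here is a much simpler hand-written variant that uses a default argument. It is not the decorator or the bind_all_modules function this answer refers to; roots and its _sqrt parameter are hypothetical names.

```python
import math


def roots(values, _sqrt=math.sqrt):
    # math.sqrt is looked up once, when the def statement runs, and stored
    # as the default value of _sqrt. Inside the function it is a local name.
    result = []
    for v in values:
        result.append(_sqrt(v))
    return result


if __name__ == "__main__":
    print(roots([4.0, 9.0]))
```

The trade-off is an extra parameter in the signature that callers could override by accident, which is part of why an automated binding approach may look attractive.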