How can I avoid the delay of initializing Python module globals?
I'm trying to optimize the general load time of a web application written in Python. My application uses a lot of modules, some of which might or might not actually be needed for a given request.
Since page load time is an important factor in how end users perceive the quality of a site, I'm trying to reduce the impact of loading possibly unnecessary modules, and especially the time (and memory) required to initialize globals that might not be needed at all.
Simply put, my goals are:
- To reduce module initialization time as much as possible (not CPU usage).
- To reduce the memory taken by unneeded global variables.
To illustrate, here's a trivial module example:
    COMMON = set(('alice', 'has', 'cat', 'with', 'blue', 'eyes'))
It takes time to build the set for COMMON; if COMMON is never used, that's a waste of load time and memory.
Obviously for a single module/global, the cost is negligible, but what if you have 100 modules with 100 variables?
One approach to make this faster is to delay initialization like this:
    __cache_common = None

    def getCommon():
        global __cache_common
        # not initialized yet: build the set on first use
        if __cache_common is None:
            __cache_common = set(('alice', 'has', 'cat', 'with', 'blue', 'eyes'))
        # return the cached value
        return __cache_common
It saves load time and memory, sacrificing some CPU.
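With 100 modules and 100 variables, writing that getter by hand each time gets repetitive. The following is only a sketch of how the same check-then-cache pattern could be factored into a decorator; the names lazy_global and get_common are hypothetical and not part of the original code, and the sketch assumes thread safety is not required.

```python
import functools


def lazy_global(factory):
    """Wrap a zero-argument factory so it runs at most once, on first call."""
    unset = object()   # sentinel, so a factory may legitimately return None
    cache = [unset]    # one-element list the closure can update in place

    @functools.wraps(factory)
    def getter():
        if cache[0] is unset:
            cache[0] = factory()   # pay the construction cost on first use only
        return cache[0]

    return getter


@lazy_global
def get_common():
    # Hypothetical lazy replacement for the module-level COMMON global.
    return set(('alice', 'has', 'cat', 'with', 'blue', 'eyes'))


print('cat' in get_common())
```

Each call still pays for a function call and an identity check, so this trades the same small amount of CPU as the hand-written version. On Python 3.2 and later, functools.lru_cache(maxsize=None) applied to a zero-argument function gives a similar effect.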
I've tried a few other techniques (see below), two of which are a bit faster than the simple caching above.
Is there another technique I could use to further reduce load time for modules and globals that might not be used on a given request?
Approaches I have tried so far (they require Python 2.6+):
    # Lets the print() calls below behave the same on Python 2.6+ and Python 3.
    from __future__ import print_function
    from timeit import Timer

    __repeat = 1000000
    __cache = None

    # Baseline: return an already-initialized global.
    def getCache():
        return __cache

    def getCacheTest():
        for i in range(__repeat):
            getCache()

    # Worst case: rebuild the set on every call.
    def getLocal():
        return set(('alice', 'has', 'cat', 'with', 'blue', 'eyes'))

    def getLocalTest():
        for i in range(__repeat):
            getLocal()

    # Lazy initialization guarded by a conditional.
    def getLazyIf():
        global __cache
        if __cache is None:
            __cache = getLocal()
        return __cache

    def getLazyIfTest():
        for i in range(__repeat):
            getLazyIf()

    # "Hook": the first call builds the value, then rebinds the
    # global name to a plain getter with no conditional.
    def __realLazy():
        return __cache

    def getLazyDynamic():
        global __cache, getLazyDynamic
        __cache = getLocal()
        getLazyDynamic = __realLazy
        return __cache

    def getLazyDynamicTest():
        for i in range(__repeat):
            getLazyDynamic()

    # "Scope hook": same idea, but the replacement getter is a nested function.
    def getLazyDynamic2():
        global __cache, getLazyDynamic2
        __cache = getLocal()
        def __realLazy2():
            return __cache
        getLazyDynamic2 = __realLazy2
        return __cache

    def getLazyDynamic2Test():
        for i in range(__repeat):
            getLazyDynamic2()

    print(sum(Timer(getCacheTest).repeat(3, 1)), getCacheTest, 'raw access')
    print(sum(Timer(getLocalTest).repeat(3, 1)), getLocalTest, 'repeat')
    print(sum(Timer(getLazyIfTest).repeat(3, 1)), getLazyIfTest, 'conditional')
    print(sum(Timer(getLazyDynamicTest).repeat(3, 1)), getLazyDynamicTest, 'hook')
    print(sum(Timer(getLazyDynamic2Test).repeat(3, 1)), getLazyDynamic2Test, 'scope hook')
With Python 2.7, I get these timings (among the lazy variants, the fastest is the hook without the nested scope):
    1.01902420559 <function getCacheTest at 0x012AE170> raw access
    5.40701374057 <function getLocalTest at 0x012AE1F0> repeat
    1.39493902158 <function getLazyIfTest at 0x012AE270> conditional
    1.06692051643 <function getLazyDynamicTest at 0x012AE330> hook
    1.15909591862 <function getLazyDynamic2Test at 0x012AE3B0> scope hook
Comments (1)
An import statement executes the module, so you shouldn't be going around changing its semantics.
How about you just tuck your import statements inside the functions or methods that need them? That way they'll only happen when they're needed, not at application startup.
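As a rough illustration of that suggestion, here is a minimal sketch with a function-local import. The function render_report is hypothetical, and json merely stands in for whatever heavier module a given request may or may not need.

```python
def render_report(data):
    # Deferred import: the module is loaded the first time this function
    # runs, not when the enclosing module is imported. Later calls only pay
    # for a lookup in sys.modules, which is cheap.
    import json
    return json.dumps(data, sort_keys=True)


print(render_report({'b': 1, 'a': 2}))
```

A request that never calls render_report never pays for the import. The first request that does call it absorbs the one-time cost.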
Ditto for the globals: turn them into class statics or something similar. Having lots of globals is bad style anyway.
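One possible reading of "class statics" is a class attribute that is filled in on first access. This sketch assumes that interpretation; the class name Words and the method common are hypothetical.

```python
class Words(object):
    # Placeholder only; the real set is not built at import time.
    _common = None

    @classmethod
    def common(cls):
        # Build the set on first access and cache it on the class.
        if cls._common is None:
            cls._common = frozenset(
                ('alice', 'has', 'cat', 'with', 'blue', 'eyes'))
        return cls._common


print('cat' in Words.common())
```

This is the same conditional check as getLazyIf in the question, just scoped to a class instead of the module namespace.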
But why is this even a problem? Are you really including so many modules that simply finding them slows things down, or are some of the included packages doing a lot of costly initialization (e.g., opening connections)? My money is on the second. If you have written the modules responsible for the slow-down, look into wrapping the initialization into appropriate constructors.
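To find out which of the two cases applies, it may help to measure before optimizing. The helper below is a hypothetical sketch (time_import is not from the original post) and assumes Python 3, since it uses time.perf_counter. It reports how long a single import takes and whether the module was already cached.

```python
import importlib
import sys
import time


def time_import(module_name):
    """Import module_name and return (seconds_taken, was_already_loaded)."""
    already_loaded = module_name in sys.modules
    start = time.perf_counter()
    importlib.import_module(module_name)
    elapsed = time.perf_counter() - start
    return elapsed, already_loaded


# json is only an example; substitute the modules your application loads.
elapsed, cached = time_import('json')
print('json: %.6f s (already loaded: %s)' % (elapsed, cached))
```

A module that was already imported reports a time near zero, so the measurement is only meaningful in a fresh interpreter. On Python 3.7 and later, running the interpreter with the -X importtime option prints a per-module breakdown without any extra code.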