Prevent Python from caching the imported modules
While developing a largish project (split across several files and folders) in Python with IPython, I run into trouble with cached imported modules.
The problem is that an import module statement reads the module only once, even if that module has changed! So each time I change something in my package, I have to quit and restart IPython. Painful.
Is there any way to properly force reloading some modules? Or, better, to somehow prevent Python from caching them?
I tried several approaches, but none works. In particular, I run into really, really weird bugs, like some modules or variables mysteriously becoming equal to None...
The only sensible resource I found is Reloading Python modules, from pyunit, but I have not checked it. I would like something like that.
A good alternative would be for IPython to restart, or restart the Python interpreter somehow.
So, if you develop in Python, what solution have you found to this problem?
Edit
To make things clear: obviously, I understand that some old variables depending on the previous state of the module may stick around. That's fine by me. But why is it so difficult in Python to force a module to reload without all sorts of strange errors happening?
More specifically, if I have my whole module in one file, module.py, then the following works fine:
import sys

try:
    del sys.modules['module']
except KeyError:  # deleting a missing key raises KeyError, not AttributeError
    pass
import module

obj = module.my_class()
This piece of code works beautifully and I can develop without quitting IPython for months.
However, whenever my module is made of several submodules, all hell breaks loose:
import sys

for mod in ['module.submod1', 'module.submod2']:
    try:
        del sys.modules[mod]  # the dictionary is sys.modules, not sys.module
    except KeyError:
        pass
# sometimes this works, sometimes not. WHY?
Why does it make such a difference to Python whether I have my module in one big file or in several submodules? Why would that approach not work?
Comments (8)
import checks to see if the module is in sys.modules, and if it is, returns it. If you want import to load the module fresh from disk, you can delete the appropriate key in sys.modules first.
There is the reload builtin function which, given a module object, will reload it from disk and place it in sys.modules. Edit: actually, it recompiles the code from the file on disk and then re-evaluates it in the existing module's __dict__, which is potentially very different from making a new module object.
Mike Graham is right though; getting reloading right is hard if you have even a few live objects that reference the contents of the module you no longer want. That existing objects still reference the classes they were instantiated from is an obvious issue, but all references created by means of from module import symbol will also still point to whatever object came from the old version of the module. Many subtly wrong things are possible.
Edit: I agree with the consensus that restarting the interpreter is by far the most reliable thing. But for debugging purposes, I guess you could try something like the following. I'm certain that there are corner cases for which this wouldn't work, but if you aren't doing anything too crazy (otherwise) with module loading in your package, it might be useful.
Which I very briefly tested like so:
printing:
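The snippets this answer refers to are not preserved here. As a rough sketch of the idea described above (drop a package and all of its loaded submodules from sys.modules, then import them again), something along these lines might work on Python 3. This is not the answerer's code: reload_package is a hypothetical helper name, and the caveats about stale references still apply.

```python
import importlib
import sys
import types


def reload_package(root_name):
    """Hypothetical helper: forget a package and its submodules, then re-import.

    Returns the freshly imported root module. Objects created from the old
    modules keep pointing at the old code.
    """
    prefix = root_name + "."
    # Collect the names first so sys.modules is not mutated while iterating.
    names = [
        name for name, mod in sys.modules.items()
        if (name == root_name or name.startswith(prefix))
        and isinstance(mod, types.ModuleType)
    ]
    for name in names:
        del sys.modules[name]
    # Let the import system notice files that changed since the last import.
    importlib.invalidate_caches()
    # Sorted order imports parents ("pkg") before children ("pkg.sub").
    for name in sorted(names):
        importlib.import_module(name)
    return sys.modules[root_name]
```

Calling reload_package('module') would re-import module, module.submod1 and module.submod2 if they had been loaded before; any name bound earlier with from module import symbol has to be re-imported by hand.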
Quitting and restarting the interpreter is the best solution. No live-reloading or no-caching strategy will work seamlessly: objects from modules that no longer exist can linger, modules sometimes store state, and even if your use case really does allow hot reloading, it is too complicated to reason about to be worth it.
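To make the "lingering objects" point concrete, here is a small self-contained sketch. It assumes a throwaway module written to a temporary directory; stale_demo and Widget are made-up names used only for illustration.

```python
import importlib
import pathlib
import sys
import tempfile

sys.dont_write_bytecode = True  # keep cached .pyc files out of the picture

# Write a throwaway module (hypothetical name) and make it importable.
tmp_dir = tempfile.mkdtemp()
source = pathlib.Path(tmp_dir) / "stale_demo.py"
source.write_text("class Widget:\n    version = 1\n")
sys.path.insert(0, tmp_dir)

import stale_demo
from stale_demo import Widget as OldWidget  # a from-import keeps its own reference
old_instance = stale_demo.Widget()

# Change the source on disk and reload the module in place.
source.write_text("class Widget:\n    version = 22\n")
importlib.invalidate_caches()
importlib.reload(stale_demo)

new_version = stale_demo.Widget.version  # attribute on the module is rebound
old_version = OldWidget.version          # the earlier from-import still sees the old class
still_matches = isinstance(old_instance, stale_demo.Widget)  # old instance, new class
```

After the reload, the module attribute refers to a brand-new class, while the name bound by the earlier from-import and the existing instance both still belong to the old one.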
IPython ships with the autoreload extension, which automatically repeats an import before each function call. It works at least in simple cases, but don't rely on it too much: in my experience, an interpreter restart is still required from time to time, especially when the code changes occur only in indirectly imported code.
Usage example from the linked page:
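The example itself is missing from this copy. At the IPython prompt, enabling the extension amounts to running %load_ext autoreload followed by %autoreload 2. The sketch below does the same thing from Python code; enable_autoreload is a hypothetical helper name, and it simply does nothing when no IPython shell is active.

```python
def enable_autoreload(mode="2"):
    """Hypothetical helper: turn on autoreload when running inside IPython.

    Returns True if the magics were run, False in a plain Python interpreter.
    """
    try:
        from IPython import get_ipython
    except ImportError:
        return False  # IPython is not installed
    shell = get_ipython()
    if shell is None:
        return False  # IPython is installed, but no interactive shell is active
    # Equivalent to typing "%load_ext autoreload" and "%autoreload 2".
    shell.run_line_magic("load_ext", "autoreload")
    shell.run_line_magic("autoreload", mode)
    return True


enabled = enable_autoreload()
```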
For Python version 3.4 and above
Refer to the documentation below for details.
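The code for this answer is not preserved here. Since Python 3.4 the standard library provides importlib.reload, so a minimal sketch of a reload could look like the following. The module reload_demo and its GREETING attribute are made up for the demonstration; in practice you would pass your own already-imported module.

```python
import importlib
import pathlib
import sys
import tempfile

sys.dont_write_bytecode = True  # keep cached .pyc files out of the picture

# Create a throwaway module (hypothetical name) in a temporary directory.
tmp_dir = tempfile.mkdtemp()
module_path = pathlib.Path(tmp_dir) / "reload_demo.py"
module_path.write_text("GREETING = 'hello'\n")
sys.path.insert(0, tmp_dir)

import reload_demo
first = reload_demo.GREETING

# Edit the source on disk, then re-execute it in the existing module object.
module_path.write_text("GREETING = 'hello again'\n")
importlib.invalidate_caches()
reload_demo = importlib.reload(reload_demo)
second = reload_demo.GREETING
```

Names imported earlier with from reload_demo import GREETING would not be updated by this; they need to be imported again after the reload.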
There are some really good answers here already, but it is worth knowing about dreload, a function available in IPython that does a "deep reload". From the documentation:
http://ipython.org/ipython-doc/dev/interactive/reference.html#dreload
It is available as a "global" in IPython notebook (at least my version, which is running v2.0).
HTH
You can use the import hook machinery described in PEP 302 to load not the modules themselves but some kind of proxy object that lets you do anything you want with the underlying module object: reload it, drop the reference to it, and so on.
An additional benefit is that your existing code will not require changes, and this extra module functionality can be removed at a single point in the code: where you actually add the finder to sys.meta_path.
Some thoughts on implementing this: create a finder that agrees to find any module except builtins (you have nothing to do with builtin modules), then create a loader that returns a proxy object subclassed from types.ModuleType instead of the real module object. Note that loader objects are not forced to create explicit references to loaded modules in sys.modules, but doing so is strongly encouraged because, as you have already seen, things may otherwise fail unexpectedly. The proxy object should catch all __getattr__, __setattr__ and __delattr__ calls and forward them to the underlying real module it holds a reference to. You probably won't need to define __getattribute__, because you will not be hiding the real module's contents with your proxy methods. So now you need to communicate with the proxy in some way: you can create a special method that drops the underlying reference, then import the module, extract the reference from the returned proxy, drop the proxy and hold a reference to the reloaded module. Phew, it looks scary, but it should fix your problem without reloading Python each time.
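A minimal sketch of only the proxy half of this idea, under stated assumptions: ModuleProxy and reload_underlying are hypothetical names, the proxy is created by hand rather than by a finder and loader on sys.meta_path, and the real module stays registered in sys.modules so that importlib.reload can find it.

```python
import importlib
import json
import types


class ModuleProxy(types.ModuleType):
    """Hypothetical proxy that forwards attribute access to a real module."""

    def __init__(self, real_module):
        super().__init__(real_module.__name__)
        # Store the reference in the instance dict directly, bypassing the
        # forwarding __setattr__ defined below.
        self.__dict__["_real"] = real_module

    def __getattr__(self, name):
        # Called only when normal lookup on the proxy itself fails.
        return getattr(self.__dict__["_real"], name)

    def __setattr__(self, name, value):
        setattr(self.__dict__["_real"], name, value)

    def __delattr__(self, name):
        delattr(self.__dict__["_real"], name)

    def reload_underlying(self):
        # Swap in whatever importlib.reload returns and keep the same proxy.
        self.__dict__["_real"] = importlib.reload(self.__dict__["_real"])
        return self


# Usage with a standard-library module standing in for your own package.
proxy = ModuleProxy(json)
encoded = proxy.dumps({"a": 1})  # forwarded to json.dumps
proxy.reload_underlying()        # re-executes the json module in place
```

Code that holds the proxy keeps working across reloads because every attribute lookup goes through to the current underlying module; objects fetched through the proxy before a reload are still the old ones.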
I am using PythonNet in my project. Fortunately, I found there is a command which can perfectly solve this problem.
Think twice before quitting and restarting in production.
The easy solution that avoids quitting and restarting is to use reload from imp.