Prevent Python from caching imported modules

Posted 2024-09-03 11:33:59

While developing a largeish project (split in several files and folders) in Python with IPython, I run into the trouble of cached imported modules.

The problem is that the statement import module reads the module only once, even if that module has changed! So each time I change something in my package, I have to quit and restart IPython. Painful.

Is there any way to properly force reloading some modules? Or, better, to somehow prevent Python from caching them?

I tried several approaches, but none works. In particular I run into really, really weird bugs, like some modules or variables mysteriously becoming equal to None...

The only sensible resource I found is Reloading Python modules, from pyunit, but I have not checked it. I would like something like that.

A good alternative would be for IPython to restart, or restart the Python interpreter somehow.

So, if you develop in Python, what solution have you found to this problem?

Edit

To make things clear: obviously, I understand that some old variables depending on the previous state of the module may stick around. That's fine by me. But why is it so difficult in Python to force-reload a module without all sorts of strange errors happening?

More specifically, if I have my whole module in one file module.py then the following works fine:

import sys
try:
    del sys.modules['module']
except KeyError:
    pass
import module

obj = module.my_class()

This piece of code works beautifully and I can develop without quitting IPython for months.

However, whenever my module is made of several submodules, hell breaks loose:

import sys
for mod in ['module.submod1', 'module.submod2']:
    try:
        del sys.modules[mod]
    except KeyError:
        pass
# sometimes this works, sometimes not. WHY?

Why does it make such a difference to Python whether I have my module in one big file or in several submodules? Why would that approach not work?

灼疼热情 2024-09-10 11:33:59

import checks to see if the module is in sys.modules, and if it is, it returns it. If you want import to load the module fresh from disk, you can delete the appropriate key in sys.modules first.
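The caching behaviour described above can be observed directly. The following is a minimal sketch, assuming Python 3 and a throwaway module named cache_demo (a hypothetical name) written to a temporary directory; bytecode writing is switched off so that a quick edit is not masked by a stale .pyc file.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical throwaway module, created in a temp directory for this demo.
sys.dont_write_bytecode = True       # keep stale .pyc files out of the picture
tmpdir = tempfile.mkdtemp()
sys.path.insert(0, tmpdir)
path = os.path.join(tmpdir, "cache_demo.py")

with open(path, "w") as f:
    f.write("VALUE = 1\n")
importlib.invalidate_caches()        # let the import system notice the new file

import cache_demo
first = cache_demo

# Edit the source on disk, then import again: the cached module is returned.
with open(path, "w") as f:
    f.write("VALUE = 2  # edited\n")
import cache_demo
same_object = cache_demo is first    # served from sys.modules
cached_value = cache_demo.VALUE      # the edit is not visible yet

# Dropping the key makes the next import read the file again.
del sys.modules["cache_demo"]
import cache_demo
fresh_value = cache_demo.VALUE       # value from the edited file
old_value = first.VALUE              # the old module object is left untouched
```

Note that the old module object survives for as long as anything still references it, which is where the trouble discussed below starts.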

There is the reload builtin function (importlib.reload in Python 3), which, given a module object, reloads it from disk and places the result in sys.modules. Edit: actually, it recompiles the code from the file on disk and then re-evaluates it in the existing module's __dict__, which is potentially very different from making a new module object.

Mike Graham is right, though: reloading correctly is hard if you have even a few live objects that reference contents of the module you no longer want. The obvious issue is that existing objects still reference the classes they were instantiated from, but in addition every reference created by from module import symbol still points to the object from the old version of the module. Many subtly wrong things are possible.
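To make the stale-reference problem concrete, here is a small sketch, assuming Python 3 (where reload lives in importlib) and a hypothetical temporary module stale_demo containing a class Greeter. Neither name comes from the question.

```python
import importlib
import os
import sys
import tempfile

sys.dont_write_bytecode = True       # avoid stale .pyc files after quick edits
tmpdir = tempfile.mkdtemp()
sys.path.insert(0, tmpdir)
path = os.path.join(tmpdir, "stale_demo.py")   # hypothetical module


def write_version(n):
    """Write a Greeter class whose version() method returns n."""
    with open(path, "w") as f:
        f.write("class Greeter:\n    def version(self):\n        return %d\n" % n)


write_version(1)
importlib.invalidate_caches()

import stale_demo
from stale_demo import Greeter       # a second name bound to the class object
obj = stale_demo.Greeter()           # a live instance of the original class

write_version(2)
importlib.reload(stale_demo)         # re-executes the source in the same module object

new_via_module = stale_demo.Greeter().version()   # attribute lookup sees the new class
old_via_name = Greeter().version()                # the from-import name keeps the old class
old_instance = obj.version()                      # the instance keeps its old class
is_instance_of_new = isinstance(obj, stale_demo.Greeter)
```

After the reload, only lookups that go through the module object see the new class. The name bound by from stale_demo import Greeter and the existing instance both keep the old one, and the isinstance check against the new class fails.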

Edit: I agree with the consensus that restarting the interpreter is by far the most reliable thing. But for debugging purposes, I guess you could try something like the following. I'm certain that there are corner cases for which this wouldn't work, but if you aren't doing anything too crazy (otherwise) with module loading in your package, it might be useful.

import importlib
import sys
import types

def reload_package(root_module):
    package_name = root_module.__name__

    # get a reference to each loaded module of the package
    # (the package itself plus everything named "<package>.<something>")
    loaded_package_modules = dict([
        (key, value) for key, value in sys.modules.items()
        if (key == package_name or key.startswith(package_name + '.'))
        and isinstance(value, types.ModuleType)])

    # delete references to these loaded modules from sys.modules
    for key in loaded_package_modules:
        del sys.modules[key]

    # load each of the modules again;
    # make old modules share state with new modules
    for key in loaded_package_modules:
        print('reloading %s' % key)
        # import_module returns the named submodule itself, whereas
        # __import__('a.b') would return the top-level package 'a'
        newmodule = importlib.import_module(key)
        oldmodule = loaded_package_modules[key]
        oldmodule.__dict__.clear()
        oldmodule.__dict__.update(newmodule.__dict__)

Which I very briefly tested like so:

import email, email.mime, email.mime.application
reload_package(email)

printing:

reloading email.iterators
reloading email.mime
reloading email.quoprimime
reloading email.encoders
reloading email.errors
reloading email
reloading email.charset
reloading email.mime.application
reloading email._parseaddr
reloading email.utils
reloading email.mime.base
reloading email.message
reloading email.mime.nonmultipart
reloading email.base64mime

梦幻之岛 2024-09-10 11:33:59

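Quitting and restarting the interpreter is the best solution. No live-reloading or no-caching strategy will work seamlessly: objects from modules that no longer exist can linger, modules sometimes store state, and even if your use case really does allow hot reloading, it is too complicated to reason about to be worth it.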

野侃 2024-09-10 11:33:59

IPython comes with the autoreload extension, which automatically repeats an import before each function call. It works at least in simple cases, but don't rely on it too much: in my experience, an interpreter restart is still required from time to time, especially when code changes occur only in indirectly imported code.

Usage example from the linked page:

In [1]: %load_ext autoreload

In [2]: %autoreload 2

In [3]: from foo import some_function

In [4]: some_function()
Out[4]: 42

In [5]: # open foo.py in an editor and change some_function to return 43

In [6]: some_function()
Out[6]: 43

以歌曲疗慰 2024-09-10 11:33:59

For Python version 3.4 and above

import importlib 
importlib.reload(<package_name>) 
from <package_name> import <method_name>

Refer to the importlib documentation for details.
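One caveat that bears on the multi-submodule case from the question: importlib.reload(package) only re-executes the package's __init__.py, while submodules are still served from sys.modules. The sketch below assumes a hypothetical package demo_pkg with a single submodule helper, both created in a temporary directory purely for illustration.

```python
import importlib
import os
import sys
import tempfile

sys.dont_write_bytecode = True       # avoid stale .pyc files after quick edits
tmpdir = tempfile.mkdtemp()
sys.path.insert(0, tmpdir)
pkg_dir = os.path.join(tmpdir, "demo_pkg")     # hypothetical package
os.mkdir(pkg_dir)


def write(name, text):
    """Write one source file inside the demo package."""
    with open(os.path.join(pkg_dir, name), "w") as f:
        f.write(text)


write("__init__.py", "from .helper import answer\n")
write("helper.py", "def answer():\n    return 42\n")
importlib.invalidate_caches()

import demo_pkg
before = demo_pkg.answer()

# Edit the submodule on disk.
write("helper.py", "def answer():\n    return 43\n")

# Reloading the package re-runs __init__.py only; helper comes from sys.modules.
importlib.reload(demo_pkg)
after_package_reload = demo_pkg.answer()

# Reload the submodule first, then the package so that its
# "from .helper import answer" line rebinds the new function.
importlib.reload(sys.modules["demo_pkg.helper"])
importlib.reload(demo_pkg)
after_both = demo_pkg.answer()
```

Reloading in dependency order (submodules before the modules that import from them) is what makes the edit visible here. In a larger package, working out that order by hand quickly becomes tedious.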

拥抱我好吗 2024-09-10 11:33:59

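There are already some very good answers here, but it is worth knowing about dreload, a function available in IPython that performs a "deep reload". From the documentation:

The IPython.lib.deepreload module allows you to recursively reload a module: changes made to any of its dependencies will be reloaded without having to exit. To start using it, do:

http://ipython.org/ipython-doc/dev/interactive/reference.html#dreload

It is available as a "global" in the IPython notebook (at least in my version, which is running v2.0).

HTH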

眼藏柔 2024-09-10 11:33:59

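You can use the import hook machinery described in PEP 302 to load not the modules themselves but some kind of proxy object, which lets you do anything you want with the underlying module object: reload it, drop the reference to it, and so on.

An additional benefit is that your existing code does not need to change, and this extra module functionality can be removed again from a single point in the code: the place where you actually add the finder to sys.meta_path.

Some thoughts on the implementation: create a finder that agrees to find any module except builtin ones (you have no business with builtin modules), then create a loader that returns a proxy object subclassed from types.ModuleType instead of the real module object. Note that a loader is not forced to create explicit references to loaded modules in sys.modules, but doing so is strongly encouraged because, as you have already seen, things may otherwise fail unexpectedly. The proxy object should catch all __getattr__, __setattr__ and __delattr__ calls and forward them to the underlying real module it holds a reference to. You probably won't need to define __getattribute__, because you will not be hiding real module contents behind proxy methods. So now you need to communicate with the proxy in some way: you can create a special method that drops the underlying reference, then import the module, extract the reference from the returned proxy, drop the proxy and keep the reference to the reloaded module. Phew, it looks scary, but it should fix your problem without restarting Python each time.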
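As a rough illustration of the proxy half of this idea only, here is a sketch assuming Python 3. The class ModuleProxy, its swap method and the temporary module proxied_demo are hypothetical names invented for this example, and the finder and loader that would hand out such proxies through sys.meta_path are deliberately left out.

```python
import importlib
import os
import sys
import tempfile
import types


class ModuleProxy(types.ModuleType):
    """Hypothetical proxy that forwards attribute access to a real module."""

    def __init__(self, real_module):
        super().__init__(real_module.__name__)
        # Bypass our own __setattr__ so the reference is stored on the proxy.
        types.ModuleType.__setattr__(self, "_real", real_module)

    def __getattr__(self, name):
        # Only called when normal lookup on the proxy itself fails.
        if name == "_real":
            raise AttributeError(name)
        return getattr(self._real, name)

    def __setattr__(self, name, value):
        setattr(self._real, name, value)

    def __delattr__(self, name):
        delattr(self._real, name)

    def swap(self):
        """Drop the underlying module and import a fresh copy in its place."""
        name = self._real.__name__
        sys.modules.pop(name, None)
        fresh = importlib.import_module(name)
        types.ModuleType.__setattr__(self, "_real", fresh)


# Demo with a throwaway module written to a temp directory.
sys.dont_write_bytecode = True       # avoid stale .pyc files after quick edits
tmpdir = tempfile.mkdtemp()
sys.path.insert(0, tmpdir)
path = os.path.join(tmpdir, "proxied_demo.py")

with open(path, "w") as f:
    f.write("VALUE = 1\n")
importlib.invalidate_caches()

proxy = ModuleProxy(importlib.import_module("proxied_demo"))
value_before = proxy.VALUE

with open(path, "w") as f:
    f.write("VALUE = 2\n")
proxy.swap()                         # every holder of the proxy now sees the new module
value_after = proxy.VALUE

proxy.EXTRA = "set through the proxy"
forwarded = sys.modules["proxied_demo"].EXTRA
```

Because callers hold the proxy rather than the module, replacing the underlying module updates all of them at once. Objects already taken out of the module, such as classes, functions and instances, remain the old ones.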

凌乱心跳 2024-09-10 11:33:59

I am using PythonNet in my project. Fortunately, I found there is a command which can perfectly solve this problem.

using (Py.GIL())
{
    dynamic mod = Py.Import(this.moduleName);
    if (mod == null)
        throw new Exception(string.Format("Cannot find module {0}. Python script may not be compiled successfully or module name is illegal.", this.moduleName));

    // This command works perfectly for me!
    PythonEngine.ReloadModule(mod);

    dynamic instance = mod.ClassName();
}

回眸一遍 2024-09-10 11:33:59

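Think twice before quitting and restarting in production.

The easy solution without quitting and restarting is to use reload, historically from imp and, since Python 3.4, from importlib (imp was removed in Python 3.12):

import moduleA, moduleB
from importlib import reload  # on Python 2 and early Python 3: from imp import reload
reload(moduleB)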
