Should import statements always be at the top of a module?

Posted on 2024-07-06 13:40:48 · 565 words · 5 views · 0 comments

PEP 8 states:

Imports are always put at the top of the file, just after any module comments and docstrings, and before module globals and constants.

However if the class/method/function that I am importing is only used in rare cases, surely it is more efficient to do the import when it is needed?

Isn't this:

class SomeClass(object):

    def not_often_called(self):
        from datetime import datetime
        self.datetime = datetime.now()

more efficient than this?

from datetime import datetime

class SomeClass(object):

    def not_often_called(self):
        self.datetime = datetime.now()


Comments (22)

┾廆蒐ゝ 2024-07-13 13:40:48

Module importing is quite fast, but not instant. This means that:

  • Putting the imports at the top of the module is fine, because it's a trivial cost that's only paid once.
  • Putting the imports within a function will cause calls to that function to take longer.

So if you care about efficiency, put the imports at the top. Only move them into a function if your profiling shows that would help (you did profile to see where best to improve performance, right??)


The best reasons I've seen to perform lazy imports are:

  • Optional library support. If your code has multiple paths that use different libraries, don't break if an optional library is not installed.
  • In the __init__.py of a plugin, which might be imported but not actually used. Examples are Bazaar plugins, which use bzrlib's lazy-loading framework.
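
The optional-library case can be sketched roughly as follows. This is only an illustration: `fancyplot_demo_lib` is a hypothetical package name (assumed not to be installed), and `render_report` and `annotate` are invented names, not part of any real API.

```python
def render_report(values):
    """Summarize values; use an optional library only if it is installed."""
    summary = "n=%d, total=%s" % (len(values), sum(values))
    try:
        # Hypothetical optional dependency, imported only when this runs.
        import fancyplot_demo_lib
    except ImportError:
        # Library missing: fall back to plain text instead of breaking.
        return summary
    return fancyplot_demo_lib.annotate(summary)  # hypothetical call


print(render_report([1, 2, 3]))
```

Because the import sits inside the function, the module that defines `render_report` can still be imported on systems where the optional package is absent.
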
安人多梦 2024-07-13 13:40:48

Putting the import statement inside of a function can prevent circular dependencies.
For example, if you have 2 modules, X.py and Y.py, and they both need to import each other, importing one of them creates a circular dependency, and the import can fail because one module sees the other only partially initialized. If you move the import statement in one of the modules into a function, it won't try to import the other module until the function is called; by then that module will already be imported, so the cycle is harmless. Read here for more - effbot.org/zone/import-confusion.htm
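
A minimal sketch of the idea, assuming two throwaway modules written to a temporary directory. The names `x_demo`, `y_demo`, `x_value` and `y_value` are hypothetical and exist only for this illustration. With a top-level `from y_demo import y_value` in `x_demo`, the result could depend on which module happens to be imported first; with the import deferred into the function, both orders work.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

X_SOURCE = textwrap.dedent('''
    def x_value():
        # Deferred import: y_demo is looked up only when x_value() runs,
        # by which time x_demo itself is fully initialized.
        from y_demo import y_value
        return y_value() + 1
''')

Y_SOURCE = textwrap.dedent('''
    import x_demo  # top-level import back into x_demo


    def y_value():
        return 41
''')


def run_demo():
    """Write both modules to a temp dir, import x_demo and call x_value()."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "x_demo.py").write_text(X_SOURCE)
        Path(tmp, "y_demo.py").write_text(Y_SOURCE)
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            x_demo = importlib.import_module("x_demo")
            return x_demo.x_value()
        finally:
            # Clean up so the demo leaves no trace in the interpreter.
            sys.path.remove(tmp)
            sys.modules.pop("x_demo", None)
            sys.modules.pop("y_demo", None)


print(run_demo())
```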

御守 2024-07-13 13:40:48

I have adopted the practice of putting all imports in the functions that use them, rather than at the top of the module.

The benefit I get is the ability to refactor more reliably. When I move a function from one module to another, I know that the function will continue to work with all of its legacy of testing intact. If I have my imports at the top of the module, when I move a function, I find that I end up spending a lot of time getting the new module's imports complete and minimal. A refactoring IDE might make this irrelevant.

There is a speed penalty as mentioned elsewhere. I have measured this in my application and found it to be insignificant for my purposes.

It is also nice to be able to see all module dependencies up front without resorting to search (e.g. grep). However, the reason I care about module dependencies is generally because I'm installing, refactoring, or moving an entire system comprising multiple files, not just a single module. In that case, I'm going to perform a global search anyway to make sure I have the system-level dependencies. So I have not found global imports to aid my understanding of a system in practice.

I usually put the import of sys inside the if __name__=='__main__' check and then pass arguments (like sys.argv[1:]) to a main() function. This allows me to use main in a context where sys has not been imported.
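
The pattern in the last paragraph might look like the sketch below; the body of `main` is invented for illustration and only the placement of `import sys` matters.

```python
def main(args):
    """Build a greeting from a list of command-line style arguments."""
    name = args[0] if args else "world"
    return "hello, " + name


if __name__ == "__main__":
    # sys is needed only when the file is run as a script.
    import sys
    print(main(sys.argv[1:]))
```

Since `main` receives its arguments explicitly, it can be called from tests or other modules without touching `sys.argv`.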

眼中杀气 2024-07-13 13:40:48

Most of the time, putting imports at the top is useful for clarity and a sensible thing to do, but it's not always the case. Below are a couple of examples of circumstances where module imports might live elsewhere.

Firstly, you could have a module with a unit test of the form:

if __name__ == '__main__':
    import foo
    aa = foo.xyz()         # initiate something for the test

Secondly, you might have a requirement to conditionally import some different module at runtime.

if [condition]:
    import foo as plugin_api
else:
    import bar as plugin_api
xx = plugin_api.Plugin()
[...]

There are probably other situations where you might place imports in other parts of the code.
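
As a runnable variant of the conditional import, here is a small sketch that picks between two standard-library modules depending on the platform. The alias `path_impl` and the file names are made up for the example.

```python
import sys

# Pick an implementation module based on a runtime condition.
if sys.platform == "win32":
    import ntpath as path_impl      # Windows-style path handling
else:
    import posixpath as path_impl   # POSIX-style path handling

joined = path_impl.join("reports", "2024.txt")
print(joined)
```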

青丝拂面 2024-07-13 13:40:48

The first variant is indeed more efficient than the second when the function is called either zero or one times. With the second and subsequent invocations, however, the "import every call" approach is actually less efficient. See this link for a lazy-loading technique that combines the best of both approaches by doing a "lazy import".

But there are reasons other than efficiency why you might prefer one over the other. Importing at the top makes it much clearer to someone reading the code which dependencies this module has. The two variants also have very different failure characteristics: if there's no "datetime" module, the top-level import will fail at load time, while the in-function import won't fail until the method is called.

Added Note: In IronPython, imports can be quite a bit more expensive than in CPython because the code is basically being compiled as it's being imported.
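
One possible form of a "lazy import" is sketched below. It uses the standard library's `importlib.util.LazyLoader` and is not necessarily the technique the linked page describes; `lazy_import` is a helper name chosen for this example.

```python
import importlib.util
import sys


def lazy_import(name):
    """Return a module whose code runs only on first attribute access."""
    if name in sys.modules:
        return sys.modules[name]  # already loaded, nothing to defer
    spec = importlib.util.find_spec(name)
    if spec is None or spec.loader is None:
        raise ImportError("cannot find module %r" % name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)  # sets up laziness; the real load is deferred
    return module


fnmatch_mod = lazy_import("fnmatch")
print(fnmatch_mod.fnmatch("demo.py", "*.py"))
```

The name is bound up front, like a top-level import, but the cost of executing the module is paid only when one of its attributes is first used.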

韶华倾负 2024-07-13 13:40:48

Here's an updated summary of the answers to this and related questions.

  • PEP 8 recommends putting imports at the top.
  • It's often more convenient to get ImportErrors when you first run your program rather than when your program first calls your function.
  • Putting imports in the function scope can help avoid issues with circular imports.
  • Putting imports in the function scope helps maintain a clean module namespace, so that the imported names do not appear among tab-completion suggestions.
  • Start-up time: imports in a function won't run until (if) that function is called. This might become significant with heavy-weight libraries.
  • Even though import statements are super fast on subsequent runs, they still incur a speed penalty, which can be significant if the function is trivial but frequently in use.
  • Imports under the __name__ == "__main__" guard seem very reasonable.
  • Refactoring might be easier if the imports are located in the function where they're used (facilitates moving it to another module). It can also be argued that this is good for readability. However, most would argue the contrary, see the next item.
  • Imports at the top enhance readability, since you can see all your dependencies at a glance.
  • It seems unclear if dynamic (possibly conditional) imports favour one style over another.

笑梦风尘 2024-07-13 13:40:48

Curt makes a good point: the second version is clearer and will fail at load time rather than later, and unexpectedly.

Normally I don't worry about the efficiency of loading modules, since it's (a) pretty fast, and (b) mostly only happens at startup.

If you have to load heavyweight modules at unexpected times, it probably makes more sense to load them dynamically with the __import__ function, and be sure to catch ImportError exceptions, and handle them in a reasonable manner.
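
A rough sketch of dynamic loading with error handling follows. It uses `importlib.import_module`, a standard-library wrapper around the same machinery, rather than calling `__import__` directly; `load_or_none` and the missing module name are hypothetical.

```python
import importlib


def load_or_none(module_name):
    """Import a module by name at run time; return None if unavailable."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        # Handle the failure where it happens instead of crashing later.
        return None


decimal_mod = load_or_none("decimal")
missing = load_or_none("no_such_module_for_demo")  # hypothetical name
print(decimal_mod is not None, missing is None)
```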

爱人如己 2024-07-13 13:40:48

I was surprised not to see actual cost numbers for the repeated load-checks posted already, although there are many good explanations of what to expect.

If you import at the top, you take the load hit no matter what. That's pretty small, but commonly in the milliseconds, not nanoseconds.

If you import within a function(s), then you only take the hit for loading if and when one of those functions is first called. As many have pointed out, if that doesn't happen at all, you save the load time. But if the function(s) get called a lot, you take a repeated though much smaller hit (for checking that it has been loaded; not for actually re-loading). On the other hand, as @aaronasterling pointed out you also save a little because importing within a function lets the function use slightly-faster local variable lookups to identify the name later (http://stackoverflow.com/questions/477096/python-import-coding-style/4789963#4789963).

Here are the results of a simple test that imports a few things from inside a function. The times reported (in Python 2.7.14 on a 2.3 GHz Intel Core i7) are shown below (the 2nd call taking more than later calls seems consistent, though I don't know why).

 0 foo:   14429.0924 µs
 1 foo:      63.8962 µs
 2 foo:      10.0136 µs
 3 foo:       7.1526 µs
 4 foo:       7.8678 µs
 0 bar:       9.0599 µs
 1 bar:       6.9141 µs
 2 bar:       7.1526 µs
 3 bar:       7.8678 µs
 4 bar:       7.1526 µs

The code:

from __future__ import print_function
from time import time

def foo():
    import collections
    import re
    import string
    import math
    import subprocess
    return

def bar():
    import collections
    import re
    import string
    import math
    import subprocess
    return

t0 = time()
for i in xrange(5):
    foo()
    t1 = time()
    print("    %2d foo: %12.4f \xC2\xB5s" % (i, (t1-t0)*1E6))
    t0 = t1
for i in xrange(5):
    bar()
    t1 = time()
    print("    %2d bar: %12.4f \xC2\xB5s" % (i, (t1-t0)*1E6))
    t0 = t1
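
For readers on Python 3, a comparable measurement can be sketched with `timeit`. This is not the script that produced the numbers above; the function names are invented, and the results will differ by machine and interpreter version.

```python
import collections
import timeit


def import_inside():
    import collections  # served from sys.modules after the first import
    return collections


def import_at_top():
    return collections  # uses the module-level name


def measure(func, number=100_000):
    """Return the average time per call in microseconds."""
    return timeit.timeit(func, number=number) / number * 1e6


inside_us = measure(import_inside)
top_us = measure(import_at_top)
print("in-function import:  %.4f us per call" % inside_us)
print("module-level import: %.4f us per call" % top_us)
```

Averaging over many calls isolates the repeated "is it already loaded?" check from the one-time cost of the first import.
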
迟到的我 2024-07-13 13:40:48

I wouldn't worry about the efficiency of loading the module up front too much. The memory taken up by the module won't be very big (assuming it's modular enough) and the startup cost will be negligible.

In most cases you want to load the modules at the top of the source file. For somebody reading your code, it makes it much easier to tell what function or object came from what module.

One good reason to import a module elsewhere in the code is if it's used in a debugging statement.

For example:

do_something_with_x(x)

I could debug this with:

from pprint import pprint
pprint(x)
do_something_with_x(x)

Of course, the other reason to import modules elsewhere in the code is if you need to dynamically import them. This is because you pretty much don't have any choice.

萤火眠眠 2024-07-13 13:40:48

It's a tradeoff that only the programmer can decide to make.

Case 1 saves some memory and startup time by not importing the datetime module (and doing whatever initialization it might require) until needed. Note that doing the import 'only when called' also means doing it 'every time when called', so each call after the first one is still incurring the additional overhead of doing the import.

Case 2 saves some execution time and latency by importing datetime beforehand so that not_often_called() will return more quickly when it is called, and also by not incurring the overhead of an import on every call.

Besides efficiency, it's easier to see module dependencies up front if the import statements are ... up front. Hiding them down in the code can make it more difficult to easily find what modules something depends on.

Personally I generally follow the PEP except for things like unit tests and such that I don't want always loaded because I know they aren't going to be used except for test code.

花辞树 2024-07-13 13:40:48

Here's an example where all the imports are at the very top (this is the only time I've needed to do this). I want to be able to terminate a subprocess on both Un*x and Windows.

import os
# ...
try:
    kill = os.kill  # will raise AttributeError on Windows
    from signal import SIGTERM
    def terminate(process):
        kill(process.pid, SIGTERM)
except (AttributeError, ImportError):
    try:
        from win32api import TerminateProcess  # use win32api if available
        def terminate(process):
            TerminateProcess(int(process._handle), -1)
    except ImportError:
        def terminate(process):
            raise NotImplementedError  # define a dummy function

(On review: what John Millikin said.)

梦太阳 2024-07-13 13:40:48

This is like many other optimizations - you sacrifice some readability for speed. As John mentioned, if you've done your profiling homework and found this to be a sufficiently useful change and you need the extra speed, then go for it. It'd probably be good to put a note up with all the other imports:

from foo import bar
from baz import qux
# Note: datetime is imported in SomeClass below
紫轩蝶泪 2024-07-13 13:40:48

Module initialization only occurs once - on the first import. If the module in question is from the standard library, then you will likely import it from other modules in your program as well. For a module as prevalent as datetime, it is also likely a dependency for a slew of other standard libraries. The import statement would cost very little then since the module initialization would have happened already. All it is doing at this point is binding the existing module object to the local scope.

Couple that information with the argument for readability and I would say that it is best to have the import statement at module scope.
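
The "initialized once, then just rebound" behaviour can be checked with a short sketch; the alias and function names are chosen only for this example.

```python
import sys
import datetime as top_level_datetime


def import_again():
    # Runs the import statement again, this time inside a function.
    import datetime as local_datetime
    return local_datetime


same_object = import_again() is top_level_datetime
cached = sys.modules["datetime"] is top_level_datetime
print(same_object, cached)
```

Both checks compare object identity, so they show that the second import returns the module object already stored in `sys.modules` rather than creating a new one.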

野の 2024-07-13 13:40:48

Just to complete Moe's answer and the original question:

When we have to deal with circular dependencies we can do some "tricks". Assume we're working with modules a.py and b.py that contain x() and y(), respectively. Then:

  1. We can move one of the from imports to the bottom of the module.
  2. We can move one of the from imports inside the function or method that is actually requiring the import (this isn't always possible, as you may use it from several places).
  3. We can change one of the two from imports to be an import that looks like: import a

So, to conclude: if you aren't dealing with circular dependencies and doing some kind of trick to avoid them, then it's better to put all your imports at the top, for the reasons already explained in other answers to this question. And please, when doing these "tricks", include a comment; it's always welcome! :)

¢好甜 2024-07-13 13:40:48

In addition to the excellent answers already given, it's worth noting that the placement of imports is not merely a matter of style. Sometimes a module has implicit dependencies that need to be imported or initialized first, and a top-level import could lead to violations of the required order of execution.

This issue often comes up in Apache Spark's Python API, where you need to initialize the SparkContext before importing any pyspark packages or modules. It's best to place pyspark imports in a scope where the SparkContext is guaranteed to be available.

深爱不及久伴 2024-07-13 13:40:48

I do not aspire to provide a complete answer, because others have already done this very well. I just want to mention one use case where I find it especially useful to import modules inside functions. My application uses Python packages and modules stored in a certain location as plugins. During application startup, the application walks through all the modules in the location and imports them, then it looks inside the modules and, if it finds some mounting points for the plugins (in my case a subclass of a certain base class having a unique ID), it registers them. The number of plugins is large (now dozens, but maybe hundreds in the future) and each of them is used quite rarely. Having imports of third-party libraries at the top of my plugin modules incurred a bit of a penalty during application startup. Some third-party libraries are especially heavy to import (e.g. importing plotly even tries to connect to the internet and download something, which was adding about one second to startup). By optimizing imports in the plugins (performing them only in the functions where they are used) I managed to shrink the startup from 10 seconds to some 2 seconds. That is a big difference for my users.

So my answer is no, do not always put the imports at the top of your modules.

蒲公英的约定 2024-07-13 13:40:48

It's interesting that not a single answer mentioned parallel processing so far, where it might be REQUIRED that the imports are in the function, when the serialized function code is what is being pushed around to other cores, e.g. like in the case of ipyparallel.

我三岁 2024-07-13 13:40:48

There can be a performance gain by importing variables/local scoping inside of a function. This depends on the usage of the imported thing inside the function. If you are looping many times and accessing a module global object, importing it as local can help.

test.py

X=10
Y=11
Z=12
def add(i):
  i = i + 10

runlocal.py

from test import add, X, Y, Z

def callme():
  x=X
  y=Y
  z=Z
  ladd=add
  for i in range(100000000):
    ladd(i)
    x+y+z

callme()

run.py

from test import add, X, Y, Z

def callme():
  for i in range(100000000):
    add(i)
    X+Y+Z

callme()

Timing on Linux shows a small gain:

/usr/bin/time -f "\t%E real,\t%U user,\t%S sys" python run.py 
    0:17.80 real,   17.77 user, 0.01 sys
/tmp/test$ /usr/bin/time -f "\t%E real,\t%U user,\t%S sys" python runlocal.py 
    0:14.23 real,   14.22 user, 0.01 sys

real is wall clock. user is time in program. sys is time for system calls.

https://docs.python.org/3.5/reference/executionmodel.html#resolution-of-names
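
The local-versus-global lookup difference can also be seen in the bytecode. The sketch below is an illustration using the standard `dis` module, with made-up function names; it checks whether each function performs a global name lookup.

```python
import dis
from math import sqrt


def uses_global():
    return sqrt(16.0)          # sqrt is looked up as a module global


def uses_local():
    from math import sqrt      # binds sqrt as a local variable
    return sqrt(16.0)


def opnames(func):
    """Return the set of bytecode operation names used by func."""
    return {instr.opname for instr in dis.get_instructions(func)}


print("LOAD_GLOBAL" in opnames(uses_global))
print("LOAD_GLOBAL" in opnames(uses_local))
```

Only the first function needs a `LOAD_GLOBAL` for `sqrt`; in the second, the name is a local, which is the cheaper lookup the timings above rely on.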

唐婉 2024-07-13 13:40:48

Readability

In addition to startup performance, there is a readability argument to be made for localizing import statements. For example take python line numbers 1283 through 1296 in my current first python project:

listdata.append(['tk font version', font_version])
listdata.append(['Gtk version', str(Gtk.get_major_version())+"."+
                 str(Gtk.get_minor_version())+"."+
                 str(Gtk.get_micro_version())])

import xml.etree.ElementTree as ET

xmltree = ET.parse('/usr/share/gnome/gnome-version.xml')
xmlroot = xmltree.getroot()
result = []
for child in xmlroot:
    result.append(child.text)
listdata.append(['Gnome version', result[0]+"."+result[1]+"."+
                 result[2]+" "+result[3]])

If the import statement was at the top of file I would have to scroll up a long way, or press Home, to find out what ET was. Then I would have to navigate back to line 1283 to continue reading code.

Indeed even if the import statement was at the top of the function (or class) as many would place it, paging up and back down would be required.

Displaying the Gnome version number will rarely be done so the import at top of file introduces unnecessary startup lag.

櫻之舞 2024-07-13 13:40:48

I would like to mention a use case of mine, very similar to those mentioned by @John Millikin and @V.K.:

Optional Imports

I do data analysis with Jupyter Notebook, and I use the same IPython notebook as a template for all analyses. On some occasions, I need to import TensorFlow to do some quick model runs, but sometimes I work in places where TensorFlow isn't set up or is slow to import. In those cases, I encapsulate my TensorFlow-dependent operations in a helper function, import tensorflow inside that function, and bind it to a button.

This way, I could do "restart-and-run-all" without having to wait for the import, or having to resume the rest of the cells when it fails.

卷耳 2024-07-13 13:40:48

While PEP 8 encourages importing at the top of a module, it isn't an error to import at other levels. That indicates imports should be at the top, though there are exceptions.

It is a micro-optimization to load modules when they are used. Code that is sluggish importing can be optimized later if it makes a sizable difference.

Still, you might introduce flags to conditionally import at as near to the top as possible, allowing a user to use configuration to import the modules they need while still importing everything immediately.

Importing as soon as possible means the program will fail if any imports (or imports of imports) are missing or have syntax errors. If all imports occur at the top of all modules then python works in two steps. Compile. Run.

Built-in modules work anywhere they are imported because they are well designed. Modules you write should be the same. Moving your imports around, to the top or to their first use, can help ensure there are no side effects and the code is injecting dependencies.

Whether you put imports at the top or not, your code should still work when the imports are at the top. So start by importing immediately then optimize as needed.

逆光下的微笑 2024-07-13 13:40:48

This is a fascinating discussion. Like many others I had never even considered this topic. I got cornered into having to have the imports in the functions because of wanting to use the Django ORM in one of my libraries. I was having to call django.setup() before importing my model classes and because this was at the top of the file it was being dragged into completely non-Django library code because of the IoC injector construction.

I kind of hacked around a bit and ended up putting the django.setup() in the singleton constructor and the relevant import at the top of each class method. Now this worked fine but made me uneasy because the imports weren't at the top and also I started worrying about the extra time hit of the imports. Then I came here and read with great interest everybody's take on this.

I have a long C++ background and now use Python/Cython. My take on this is: why not put the imports in the function, unless profiling shows it causes a bottleneck? It's only like declaring space for variables just before you need them. The trouble is I have thousands of lines of code with all the imports at the top! So I think I will do it from now on and change the odd file here and there when I'm passing through and have the time.
