
Is it Pythonic to import inside functions?

Posted 2024-07-25 22:03:33

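PEP 8 says:

Imports are always put at the top of the file, just after any module comments and docstrings, and before module globals and constants.

On occasion, I violate PEP 8. Sometimes I import things inside functions. As a general rule, I do this if there is an import that is only used within a single function.

Any opinions?

EDIT (the reason I feel importing in functions can be a good idea):

Main reason: it can make the code clearer.

When looking at the code of a function I might ask myself: "What is function/class xxx?" (xxx being used inside the function). If I have all my imports at the top of the module, I have to go look there to determine what xxx is. This is more of an issue when using from m import xxx. Seeing m.xxx in the function probably tells me more, depending on what m is: is it a well-known top-level module/package (import m)? Or is it a sub-module/package (from abc import m)?
In some cases, having that extra information ("What is xxx?") close to where xxx is used can make the function easier to understand.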


南冥有猫 2024-08-01 22:03:33

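In the long run, I think you'll appreciate having most of your imports at the top of the file, so that you can tell at a glance how complicated your module is by what it needs to import.

If I'm adding new code to an existing file, I'll usually do the import where it's needed and then, if the code stays, make things more permanent by moving the import line to the top of the file.

One other point: I prefer to get an ImportError exception before any code is run, as a sanity check, so that's another reason to import at the top.

You can use a linter to check for unused modules.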
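To illustrate the sanity-check point, here is a minimal sketch of how a function-level import hides a missing dependency until the function is actually called. The module name `module_that_does_not_exist_hypothetical` is made up for the example; a top-level import of the same name would fail as soon as the file is loaded.

```python
def deferred():
    # The import statement only runs when deferred() is called, so a
    # missing dependency goes unnoticed until this code path is reached.
    import module_that_does_not_exist_hypothetical  # hypothetical name
    return module_that_does_not_exist_hypothetical


# Defining deferred() above succeeded even though the dependency is missing.
try:
    deferred()
    failed_late = False
except ImportError:
    # ModuleNotFoundError is a subclass of ImportError.
    failed_late = True

print("error surfaced only at call time:", failed_late)
```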


深空失忆 2024-08-01 22:03:33

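There are two occasions where I violate PEP 8 in this regard:

Circular imports: module A imports module B, but something in module B needs module A (though this is often a sign that I need to refactor the modules to eliminate the circular dependency).
Inserting a pdb breakpoint: import pdb; pdb.set_trace(). This is handy because I don't want to put import pdb at the top of every module I might want to debug, and it's easy to remember to remove the import when I remove the breakpoint.

Outside of these two cases, it's a good idea to put everything at the top. It makes the dependencies clearer.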
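The circular-import case can be sketched as follows. This is only an illustration: the two modules `circ_demo_a` and `circ_demo_b` are hypothetical and are written to a temporary directory so the example is self-contained. If `circ_demo_b` instead did `from circ_demo_a import greet` at its top, importing `circ_demo_a` first would fail, because `greet` would not be defined yet when `circ_demo_b` runs.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical module A: depends on B at import time.
MOD_A = '''
import circ_demo_b

def greet():
    return "hello from a, " + circ_demo_b.name()
'''

# Hypothetical module B: needs A, but only inside one function.
MOD_B = '''
def name():
    return "b"

def call_a():
    # Deferred import: by the time call_a() runs, circ_demo_a can be
    # loaded completely, so the cycle causes no trouble.
    import circ_demo_a
    return circ_demo_a.greet()
'''

tmp = tempfile.mkdtemp()
Path(tmp, "circ_demo_a.py").write_text(MOD_A)
Path(tmp, "circ_demo_b.py").write_text(MOD_B)
sys.path.insert(0, tmp)  # make the two demo modules importable

circ_demo_b = importlib.import_module("circ_demo_b")
result = circ_demo_b.call_a()
print(result)
```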


ぃ双果 2024-08-01 22:03:33

Here are the four import use cases that we use

import (and from x import y and import x as y) at the top

Choices for Import. At the top.

import settings
if settings.something:
    import this as foo
else:
    import that as foo

Conditional Import. Used with JSON, XML libraries and the like. At the top.
```
try:
    import this as foo
except ImportError:
    import that as foo
```
Dynamic Import. So far, we only have one example of this.
```
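import importlib
import settings
# __import__ does not fill the dict passed as globals, so load the module
# named in settings and read the data structure it defines directly.
module = importlib.import_module(settings.some_module)
x = module.x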
```
Note that this dynamic import doesn't bring in code, but brings in complex
data structures written in Python. It's kind of like a pickled piece of data
except we pickled it by hand.
This is also, more or less, at the top of a module.

Here's what we do to make the code clearer:

Keep the modules short.
If I have all my imports at the top of the module, I have to go look there to determine what a name is. If the module is short, that's easy to do.
In some cases having that extra information close to where a name is used can make the function easier to understand. If the module is short, that's easy to do.


℡寂寞咖啡 2024-08-01 22:03:33

One thing to bear in mind: needless imports can cause performance problems. So if this is a function that will be called frequently, you're better off just putting the import at the top. Of course this is an optimization, so if there's a valid case to be made that importing inside a function is more clear than importing at the top of a file, that trumps performance in most cases.
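To get a rough feel for that cost, something like the sketch below can be timed locally. The function names are made up for the example, `math` merely stands in for whatever module is being imported, and the absolute numbers will vary by machine and Python version.

```python
import math
import timeit


def top_level_import(x):
    # Uses the module-level name that was bound once when the file loaded.
    return math.sqrt(x)


def function_level_import(x):
    # Re-executes the import statement on every call; after the first
    # call this is a lookup in sys.modules plus a local name binding.
    import math
    return math.sqrt(x)


t_top = timeit.timeit(lambda: top_level_import(2.0), number=100_000)
t_fn = timeit.timeit(lambda: function_level_import(2.0), number=100_000)
print(f"top-level: {t_top:.4f}s  function-level: {t_fn:.4f}s")
```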

If you're using IronPython, I'm told that it's better to import inside functions (since compiling code in IronPython can be slow). Thus, you may be able to get away with importing inside functions there. But other than that, I'd argue that it's just not worth it to fight convention.

"As a general rule, I do this if there is an import that is only used within a single function."

Another point I'd like to make is that this may be a potential maintenance problem. What happens if you add a function that uses a module that was previously used by only one function? Are you going to remember to add the import to the top of the file? Or are you going to scan each and every function for imports?

FWIW, there are cases where it makes sense to import inside a function. For example, if you want to set the language in cx_Oracle, you need to set an NLS_LANG environment variable before it is imported. Thus, you may see code like this:

import os

oracle = None

def InitializeOracle(lang):
    global oracle
    os.environ['NLS_LANG'] = lang
    import cx_Oracle
    oracle = cx_Oracle


咿呀咿呀哟 2024-08-01 22:03:33

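I've broken this rule before for modules that are self-testing. That is, they are normally just used for support, but I define a main for them so that if you run them by themselves you can test their functionality. In that case I sometimes import getopt and cmd only in main, because I want it to be clear to someone reading the code that these modules have nothing to do with the normal operation of the module and are only included for testing.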


话少心凉 2024-08-01 22:03:33

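Coming from the question about loading a module twice: why not both?

An import at the top of the script indicates the dependencies, and another import in the function makes that function more atomic, while seemingly causing no performance disadvantage, since a consecutive import is cheap.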
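The reason a repeated import is cheap is that Python caches loaded modules in `sys.modules`. A small sketch, with made-up function names and `json` as an arbitrary example module:

```python
import sys


def first():
    import json  # loads the module if it is not cached yet
    return json


def second():
    import json  # found in sys.modules, so nothing is reloaded
    return json


same_object = first() is second()
cached = "json" in sys.modules
print(same_object, cached)
```

Both functions end up with the very same module object, so the second import only pays for a cache lookup and a name binding.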


从此见与不见 2024-08-01 22:03:33

Have a look at the alternative approach that's used in sqlalchemy: dependency injection:

@util.dependencies("sqlalchemy.orm.query")
def merge_result(query, *args):
    #...
    query.Query(...)

Notice how the imported library is declared in a decorator, and passed as an argument to the function!

This approach makes the code cleaner, and also works 4.5 times faster than an import statement!

Benchmark: https://gist.github.com/kolypto/589e84fbcfb6312532658df2fabdb796
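For readers who want to try the idea outside SQLAlchemy, here is a minimal sketch of such a decorator. It is not SQLAlchemy's implementation: the `dependencies` decorator and the `dump_sorted` function below are hypothetical, written only to show modules being resolved lazily and passed in as leading arguments.

```python
import functools
import importlib


def dependencies(*module_names):
    """Hypothetical decorator: import the named modules on first call
    and pass them to the wrapped function as leading arguments."""
    def decorator(func):
        resolved = []  # filled in on the first call, then reused

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if not resolved:
                resolved.extend(
                    importlib.import_module(name) for name in module_names
                )
            return func(*resolved, *args, **kwargs)
        return wrapper
    return decorator


@dependencies("json")
def dump_sorted(json, data):
    # `json` is injected by the decorator rather than imported here.
    return json.dumps(data, sort_keys=True)


print(dump_sorted({"b": 1, "a": 2}))
```

Caching the resolved modules in the closure means the import machinery is only touched on the first call, which is one plausible reason an approach like this can beat a per-call import statement.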


凶凌 2024-08-01 22:03:33

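There is another (probably "corner") case where it may be beneficial to import inside rarely used functions: shortening startup time.

I hit that wall once with a rather complex program running on a small IoT server, accepting commands from a serial line and performing operations, possibly very complex ones.

Placing the import statements at the top of the file meant having all imports processed before the server started; since the import list included jinja2, lxml, signxml and other "heavyweights" (and the SoC was not very powerful), this meant minutes before the first instruction was actually executed.

OTOH, by placing most imports in functions I was able to have the server "alive" on the serial line in seconds. Of course, when the modules were actually needed I had to pay the price (note: this can also be mitigated by spawning a background task that performs the imports during idle time).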
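The background-task mitigation could look roughly like the sketch below. Everything here is an assumption made for illustration: `HEAVY_MODULES` uses standard-library modules as stand-ins for heavy dependencies such as jinja2 or lxml, and `preload` and `handle_command` are hypothetical names.

```python
import importlib
import sys
import threading

# Stand-ins for heavy dependencies; a real list would name the slow modules.
HEAVY_MODULES = ["json", "decimal"]


def preload(names):
    # Import each module so later function-level imports hit sys.modules.
    for name in names:
        importlib.import_module(name)


def handle_command(payload):
    # Deferred import: cheap if the preloader has already run, and still
    # correct (just slower the first time) if it has not.
    import json
    return json.loads(payload)


# Start warming the import cache without blocking startup.
worker = threading.Thread(target=preload, args=(HEAVY_MODULES,), daemon=True)
worker.start()

print(handle_command('{"cmd": "status"}'))
worker.join()
```

The server can start answering commands immediately, and the deferred imports inside the handlers remain as a fallback in case a command arrives before the preloader has finished.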


心房的律动 2024-08-01 22:03:33

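As long as it's import and not from x import *, you should put them at the top. It adds just one name to the global namespace, and you stick to PEP 8. Plus, if you later need it somewhere else, you don't have to move anything around.

It's no big deal, but since there's almost no difference, I'd suggest doing what PEP 8 says.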


圈圈圆圆圈圈 2024-08-01 22:03:33

In modules that are both 'normal' modules and can be executed (i.e. have an if __name__ == '__main__': section), I usually import the modules that are only needed when the module is executed inside the main function.

Example:

def really_useful_function(data):
    ...


def main():
    from pathlib import Path
    from argparse import ArgumentParser
    from dataloader import load_data_from_directory

    parser = ArgumentParser()
    parser.add_argument('directory')
    args = parser.parse_args()
    data = load_data_from_directory(Path(args.directory))
    print(really_useful_function(data))


if __name__ == '__main__':
    main()
