Is it Pythonic to import inside functions?

Posted on 2024-07-25 22:03:33

PEP 8 says:

  • Imports are always put at the top of the file, just after any module
    comments and docstrings, and before module globals and constants.

On occasion, I violate PEP 8. Sometimes I import stuff inside functions. As a general rule, I do this if there is an import that is only used within a single function.

Any opinions?

EDIT (the reason I feel importing in functions can be a good idea):

Main reason: It can make the code clearer.

  • When looking at the code of a function I might ask myself: "What is function/class xxx?" (xxx being used inside the function). If I have all my imports at the top of the module, I have to go look there to determine what xxx is. This is more of an issue when using from m import xxx. Seeing m.xxx in the function probably tells me more. Depending on what m is: Is it a well-known top-level module/package (import m)? Or is it a sub-module/package (from a.b.c import m)?
  • In some cases having that extra information ("What is xxx?") close to where xxx is used can make the function easier to understand.

Comments (10)

南冥有猫 2024-08-01 22:03:33

In the long run I think you'll appreciate having most of your imports at the top of the file. That way you can tell at a glance how complicated your module is by what it needs to import.

If I'm adding new code to an existing file I'll usually do the import where it's needed and then if the code stays I'll make things more permanent by moving the import line to the top of the file.

One other point: I prefer to get an ImportError exception before any code is run, as a sanity check, so that's another reason to import at the top.
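
The ImportError point can be illustrated with a minimal sketch. The module name `module_that_does_not_exist_demo` and the function `render_report` are hypothetical, chosen only so the import fails; the point is that a function-level import does not fail until the function is actually called.

```python
def render_report():
    # Hypothetical missing dependency: nothing is checked when the
    # function is defined, only when this line actually executes.
    import module_that_does_not_exist_demo
    return "rendered"


# Defining render_report() above raised nothing. The failure only
# surfaces at call time, possibly long after the program has started.
error = None
try:
    render_report()
except ImportError as exc:
    error = exc

print(type(error).__name__)
```

With the same import at the top of the file, the error would be raised as soon as the module is loaded, before any other code runs.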

You can use a linter to check for unused modules.

深空失忆 2024-08-01 22:03:33

There are two occasions where I violate PEP 8 in this regard:

  • Circular imports: module A imports module B, but something in module B needs module A (though this is often a sign that I need to refactor the modules to eliminate the circular dependency)
  • Inserting a pdb breakpoint: import pdb; pdb.set_trace(). This is handy because I don't want to put import pdb at the top of every module I might want to debug, and it's easy to remember to remove the import when I remove the breakpoint.

Outside of these two cases, it's a good idea to put everything at the top. It makes the dependencies clearer.
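
The circular-import case could look roughly like the sketch below. The module names `mod_a_demo` and `mod_b_demo` are made up for illustration and are written to a temporary directory so the example is self-contained; in a real project they would be ordinary files.

```python
import sys
import tempfile
import textwrap
from pathlib import Path

# Write two hypothetical modules that depend on each other.
tmp = Path(tempfile.mkdtemp())

(tmp / "mod_a_demo.py").write_text(textwrap.dedent("""
    import mod_b_demo

    def greet():
        return "hello from a"

    def call_b():
        return mod_b_demo.use_a()
"""))

(tmp / "mod_b_demo.py").write_text(textwrap.dedent("""
    def use_a():
        # Deferred import: by the time this runs, mod_a_demo is fully
        # initialised. A top-level "from mod_a_demo import greet" here
        # would fail while mod_a_demo is still only partially loaded.
        from mod_a_demo import greet
        return greet().upper()
"""))

sys.path.insert(0, str(tmp))
import mod_a_demo

result = mod_a_demo.call_b()
print(result)
```

As the answer notes, needing this trick is often a hint that the two modules should be refactored so the cycle disappears.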

ぃ双果 2024-08-01 22:03:33

Here are the four import use cases that we use

  1. import (and from x import y and import x as y) at the top

  2. Choices for Import. At the top.

    import settings
    if settings.something:
        import this as foo
    else:
        import that as foo
    
  3. Conditional Import. Used with JSON, XML libraries and the like. At the top.

    try:
        import this as foo
    except ImportError:
        import that as foo
    
  4. Dynamic Import. So far, we only have one example of this.

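    import importlib
    import settings
    # import_module returns the module object; read names from it directly
    # (passing a dict to __import__ does not populate that dict).
    module = importlib.import_module(settings.some_module)
    x = module.x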
    

    Note that this dynamic import doesn't bring in code, but brings in complex
    data structures written in Python. It's kind of like a pickled piece of data
    except we pickled it by hand.

    This is also, more-or-less, at the top of a module
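
Cases 3 and 4 can be sketched with real module names. This is only an illustration under assumptions: `simplejson` stands in for an optional third-party library that may or may not be installed, and `SERIALIZER_MODULE` is a hypothetical stand-in for a value that would normally come from a settings module.

```python
import importlib

# Case 3, conditional import: prefer an optional library, fall back to
# the standard library when it is not installed.
try:
    import simplejson as json
except ImportError:
    import json

decoded = json.loads("[1, 2]")

# Case 4, dynamic import: the module name is only known at run time.
# SERIALIZER_MODULE is a made-up setting used for this sketch.
SERIALIZER_MODULE = "json"
serializer = importlib.import_module(SERIALIZER_MODULE)
encoded = serializer.dumps({"x": 1})

print(decoded, encoded)
```

Both forms still sit at the top of the module, so the dependency list stays visible in one place.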


Here's what we do to make the code clearer:

  • Keep the modules short.

  • If I have all my imports at the top of the module, I have to go look there to determine what a name is. If the module is short, that's easy to do.

  • In some cases having that extra information close to where a name is used can make the function easier to understand. If the module is short, that's easy to do.

℡寂寞咖啡 2024-08-01 22:03:33

One thing to bear in mind: needless imports can cause performance problems. So if this is a function that will be called frequently, you're better off just putting the import at the top. Of course this is an optimization, so if there's a valid case to be made that importing inside a function is more clear than importing at the top of a file, that trumps performance in most cases.
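
One way to check the performance claim on your own machine is a small timeit comparison. This is a rough sketch, not a benchmark: the function names `dumps_top` and `dumps_local` are made up, and the numbers will vary by interpreter and hardware.

```python
import json  # top-level import, resolved once when the module loads
import timeit


def dumps_top(payload):
    return json.dumps(payload)


def dumps_local(payload):
    # The import statement runs on every call. After the first call it
    # is a lookup in the module cache plus a local name binding.
    import json
    return json.dumps(payload)


payload = {"x": 1}
t_top = timeit.timeit(lambda: dumps_top(payload), number=20_000)
t_local = timeit.timeit(lambda: dumps_local(payload), number=20_000)
print(f"top-level: {t_top:.4f}s  in-function: {t_local:.4f}s")
```

Any difference only matters for functions that are called very frequently, which is the case the answer is warning about.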

If you're doing IronPython, I'm told that it's better to import inside functions (since compiling code in IronPython can be slow). Thus, you may be able to get away with importing inside functions there. But other than that, I'd argue that it's just not worth it to fight convention.

"As a general rule, I do this if there is an import that is only used within a single function."

Another point I'd like to make is that this may be a potential maintenance problem. What happens if you add a function that uses a module that was previously used by only one function? Are you going to remember to add the import to the top of the file? Or are you going to scan each and every function for imports?

FWIW, there are cases where it makes sense to import inside a function. For example, if you want to set the language in cx_Oracle, you need to set an NLS_LANG environment variable before it is imported. Thus, you may see code like this:

import os

oracle = None

def InitializeOracle(lang):
    global oracle
    os.environ['NLS_LANG'] = lang
    import cx_Oracle
    oracle = cx_Oracle

咿呀咿呀哟 2024-08-01 22:03:33

I've broken this rule before for modules that are self-testing. That is, they are normally just used for support, but I define a main for them so that if you run them by themselves you can test their functionality. In that case I sometimes import getopt and cmd just in main, because I want it to be clear to someone reading the code that these modules have nothing to do with the normal operation of the module and are only being included for testing.

话少心凉 2024-08-01 22:03:33

Coming from the question about loading the module twice - Why not both?

An import at the top of the script will indicate the dependencies, and another import in the function will make this function more atomic, while seemingly not causing any performance disadvantage, since a repeated import is cheap.
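
The reason a repeated import is cheap is that Python caches loaded modules in sys.modules. A minimal sketch, using the standard-library module colorsys purely as an arbitrary example and two made-up function names:

```python
import sys


def first_use():
    import colorsys  # loads the module if it is not cached yet
    return colorsys


def second_use():
    import colorsys  # served from the sys.modules cache
    return colorsys


a = first_use()
b = second_use()

same_object = a is b                      # both names refer to one module object
cached = sys.modules["colorsys"] is a     # and that object lives in the cache
print(same_object, cached)
```

The module body is executed only once; later import statements just bind the cached object to a name.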

从此见与不见 2024-08-01 22:03:33

Have a look at the alternative approach used in SQLAlchemy, dependency injection:

@util.dependencies("sqlalchemy.orm.query")
def merge_result(query, *args):
    #...
    query.Query(...)

Notice how the imported library is declared in a decorator, and passed as an argument to the function!

This approach makes the code cleaner, and also works 4.5 times faster than an import statement!

Benchmark: https://gist.github.com/kolypto/589e84fbcfb6312532658df2fabdb796
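
To make the idea concrete, here is a minimal sketch of how such a decorator might be written. It is not SQLAlchemy's implementation: the `dependencies` function below and the `to_json` example are hypothetical, and the sketch assumes modules are resolved lazily on the first call and passed as leading positional arguments.

```python
import functools
import importlib


def dependencies(*module_names):
    """Hypothetical decorator: import the named modules on first call
    and pass them to the wrapped function as leading arguments."""
    def decorator(func):
        resolved = []  # filled on the first call, reused afterwards

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if not resolved:
                resolved.extend(
                    importlib.import_module(name) for name in module_names
                )
            return func(*resolved, *args, **kwargs)

        return wrapper
    return decorator


@dependencies("json")
def to_json(json, payload):
    # "json" here is the injected module object, not a global import.
    return json.dumps(payload, sort_keys=True)


encoded = to_json({"b": 2, "a": 1})
print(encoded)
```

The dependency is declared next to the function that needs it, and the import cost is paid once, at the first call.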

凶凌 2024-08-01 22:03:33

There's another (probably "corner") case where it may be beneficial to import inside rarely used functions: shortening startup time.

I hit that wall once with a rather complex program running on a small IoT server that accepted commands from a serial line and performed operations, possibly very complex ones.

Placing import statements at the top of files meant all imports were processed before the server started. Since the import list included jinja2, lxml, signxml and other "heavyweights" (and the SoC was not very powerful), this meant minutes passed before the first instruction was actually executed.

OTOH, by placing most imports in functions I was able to have the server "alive" on the serial line in seconds. Of course, when the modules were actually needed I had to pay the price (note: this can also be mitigated by spawning a background task that does the imports during idle time).
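
The combination of deferred imports and background preloading might be sketched as follows. Everything here is an assumption for illustration: `HEAVY_MODULES` uses light standard-library modules as stand-ins for jinja2, lxml and friends, and `preload` and `handle_command` are made-up names.

```python
import importlib
import threading

# Stand-ins for heavy third-party dependencies (assumption for this sketch).
HEAVY_MODULES = ("decimal", "fractions")


def preload(names):
    # Import each module so later function-level imports hit the cache.
    for name in names:
        importlib.import_module(name)


def handle_command(value):
    # Deferred import: paid on first use unless the warm-up got there first.
    from fractions import Fraction
    return Fraction(value)


# Start serving immediately and warm the module cache in the background.
warmup = threading.Thread(target=preload, args=(HEAVY_MODULES,), daemon=True)
warmup.start()

# Joined here only so the demo is deterministic; a real server would
# keep handling commands while the warm-up thread runs.
warmup.join()
result = handle_command("3/4")
print(result)
```

Python's import machinery uses per-module locks, so a command that arrives while the warm-up thread is still importing the same module simply waits for that import to finish.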

心房的律动 2024-08-01 22:03:33

As long as it's import and not from x import *, you should put them at the top. It adds just one name to the global namespace, and you stick to PEP 8. Plus, if you later need it somewhere else, you don't have to move anything around.

It's no big deal, but since there's almost no difference I'd suggest doing what PEP 8 says.

圈圈圆圆圈圈 2024-08-01 22:03:33

In modules that are both 'normal' modules and executable (i.e. have an if __name__ == '__main__': section), I usually import modules that are only needed when running the module as a script inside the main function.

Example:

def really_useful_function(data):
    ...


def main():
    from pathlib import Path
    from argparse import ArgumentParser
    from dataloader import load_data_from_directory

    parser = ArgumentParser()
    parser.add_argument('directory')
    args = parser.parse_args()
    data = load_data_from_directory(Path(args.directory))
    print(really_useful_function(data))


if __name__ == '__main__':
    main()