Does Python optimize function calls in loops?
Say I have some code that calls a function millions of times from a loop, and I want the code to be fast:
def outer_function(file):
    for line in file:
        inner_function(line)

def inner_function(line):
    # do something
    pass
It's not necessarily file processing; it could be, for example, a point-drawing function called from a line-drawing function. The idea is that logically the two have to be separate, but from a performance point of view they should act together as fast as possible.
Does Python detect and optimize such things automatically? If not, is there a way to give it a hint to do so, perhaps with some additional external optimizer?
Python does not inline function calls, because of its dynamic nature. Theoretically, inner_function can do something that re-binds the name inner_function to something else, and Python has no way of knowing at compile time that this might happen.
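For example, a minimal sketch of the kind of rebinding meant (the answer's original snippet isn't reproduced here, so the names and printed strings are illustrative):

def inner_function(line):
    global inner_function
    print("original:", line)
    # Re-bind the module-level name at runtime; outer_function looks
    # the name up again on every iteration, so it picks this up.
    inner_function = lambda line: print("rebound:", line)

def outer_function(lines):
    for line in lines:
        inner_function(line)

outer_function(["a", "b", "c"])

Prints:

original: a
rebound: b
rebound: c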
You may think this is horrible. But think again: Python's functional and dynamic nature is one of its most appealing features. A lot of what Python allows comes at the cost of performance, and in most cases this is acceptable.
That said, you could probably hack something together using a tool like byteplay or similar: disassemble the inner function into bytecode, splice it into the outer function, and reassemble the result. On second thought, if your code is performance-critical enough to warrant such hacks, just rewrite it in C; Python has great options for FFI.
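To see the per-iteration call machinery this refers to, the standard-library dis module (read-only, unlike byteplay) can show the opcodes; the exact opcode names are version-dependent:

import dis

def inner_function(line):
    pass

def outer_function(file):
    for line in file:
        inner_function(line)

# Every loop iteration performs a fresh name lookup plus a full
# call sequence (e.g. LOAD_GLOBAL and CALL in recent CPython).
dis.dis(outer_function)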
This all applies to the official CPython implementation. A runtime-JITting interpreter (like PyPy, or the sadly defunct Unladen Swallow) can in theory detect the common case and perform inlining. Alas, I'm not familiar enough with PyPy to say whether it does this, but it certainly could.
Which Python? PyPy's JIT compiler will, after a few dozen to a few hundred iterations (depending on how many opcodes each iteration executes), start tracing execution, forget about Python function calls along the way, and compile the gathered information into a piece of optimized machine code that likely carries no remnant of the logic that made the function call happen. Traces are linear; the JIT's backend doesn't even know there was a function call, it just sees the instructions from both functions mixed together in the order they were executed. (This is the perfect case, i.e. when there is no branching in the loop, or when all iterations take the same branch. Some code is unsuited to this kind of JIT compilation and invalidates the traces quickly, before they yield much speedup, though that is rather rare.)
Now, CPython (which is what many people mean when they speak of "Python" or the Python interpreter) isn't that clever. It's a straightforward bytecode VM and will dutifully execute the logic associated with calling a function again and again on each iteration. But then again, why are you using an interpreter at all if performance is that important? If keeping such overhead as low as humanly possible matters that much, consider writing the hot loop in native code, e.g. as a C extension or in Cython.
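As one way to picture "writing the hot loop in native code", here is a hedged sketch using the stdlib ctypes FFI; hotloop.so and its process_lines function are hypothetical, something you would write and compile yourself:

import ctypes

# Hypothetical shared library, built e.g. with:
#   cc -O2 -shared -fPIC hotloop.c -o hotloop.so
lib = ctypes.CDLL("./hotloop.so")
lib.process_lines.argtypes = [ctypes.POINTER(ctypes.c_char_p), ctypes.c_size_t]

def outer_function(lines):
    # Marshal once, then run the million-iteration loop entirely in C.
    arr = (ctypes.c_char_p * len(lines))(*[s.encode() for s in lines])
    lib.process_lines(arr, len(lines))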
Unless you're doing only a tiny bit of number crunching per iteration, though, you won't see large improvements either way.
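To gauge whether the call overhead even matters for your workload, a quick micro-benchmark along these lines (the bare addition here is an assumed stand-in for your inner function) compares a called version against a manually inlined one:

import timeit

def inner(x):
    return x + 1

def with_call(n):
    total = 0
    for _ in range(n):
        total = inner(total)   # function call on every iteration
    return total

def manually_inlined(n):
    total = 0
    for _ in range(n):
        total = total + 1      # same work, no call overhead
    return total

n = 1_000_000
print("with call:       ", timeit.timeit(lambda: with_call(n), number=10))
print("manually inlined:", timeit.timeit(lambda: manually_inlined(n), number=10))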
If by "Python" you mean CPython, the generally used implementation, no.
If by "Python" you happened to mean any implementation of the Python language, yes. PyPy can optimise a lot and I believe its method JIT should take care of cases like this.
CPython (the "standard" Python implementation) doesn't do this kind of optimization.
Note, however, that if you are counting the CPU cycles of function calls, then CPython is probably not the right tool for your problem. If you are 100% sure that the algorithm you are going to use is already the best one (this is the most important thing), and that your computation really is CPU-bound, then your options are, for example, PyPy, Cython, or a C extension, as the other answers describe.
Calling a function just to execute a pass statement obviously carries a fairly high (∞) relative overhead. Whether your real program suffers undue overhead depends on the size of the inner function. If it really is just setting a pixel, then I'd suggest a different approach: use drawing primitives coded in a native language like C or C++.

There are (somewhat experimental) JIT compilers for Python that will optimise function calls, but mainstream Python won't do this.