PyPy -- How can it possibly beat CPython?

Posted 2024-08-27 14:16:19

From the Google Open Source Blog:

PyPy is a reimplementation of Python
in Python, using advanced techniques
to try to attain better performance
than CPython. Many years of hard work
have finally paid off. Our speed
results often beat CPython, ranging
from being slightly slower, to
speedups of up to 2x on real
application code, to speedups of up to
10x on small benchmarks.

How is this possible? Which Python implementation was used to implement PyPy? CPython? And what are the chances of a PyPyPy or PyPyPyPy beating their score?

(On a related note... why would anyone try something like this?)

Comments (4)

梦里人 2024-09-03 14:16:19

"PyPy is a reimplementation of Python in Python" is a rather misleading way to describe PyPy, IMHO, although it's technically true.

There are two major parts of PyPy.

  1. The translation framework
  2. The interpreter

The translation framework is a compiler. It compiles RPython code down to C (or other targets), automatically adding in aspects such as garbage collection and a JIT compiler. It cannot handle arbitrary Python code, only RPython.

RPython is a subset of normal Python; all RPython code is Python code, but not the other way around. There is no formal definition of RPython, because RPython is basically just "the subset of Python that can be translated by PyPy's translation framework". But in order to be translated, RPython code has to be statically typed (the types are inferred, you don't declare them, but it's still strictly one type per variable), and you can't do things like declaring/modifying functions/classes at runtime either.
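To make the "one type per variable" rule concrete, here is a minimal sketch in plain Python. It does not invoke the RPython toolchain; the function names and the `result_types` helper are made up for illustration. The first function is perfectly valid Python but lets one variable hold either an int or a str, which is the kind of code the rule above excludes. The second keeps a single type on every path.

```python
# Plain Python for illustration only; the RPython toolchain is not invoked here.
# All names below are hypothetical and exist only for this sketch.

def describe_dynamic(n):
    # Valid Python, but `result` holds an int on one path and a str on the
    # other, which breaks the "one type per variable" rule described above.
    if n >= 0:
        result = n * 2
    else:
        result = "negative"
    return result


def describe_static(n):
    # Every path assigns an int to `result`, so a single type can be inferred.
    if n >= 0:
        result = n * 2
    else:
        result = -1
    return result


def result_types(func, samples):
    # Hypothetical helper: collect the names of the runtime types returned.
    return {type(func(s)).__name__ for s in samples}


print(sorted(result_types(describe_dynamic, [-1, 3])))  # ['int', 'str']
print(sorted(result_types(describe_static, [-1, 3])))   # ['int']
```

Both functions run fine on an ordinary Python interpreter; the difference only matters once type inference has to assign one static type to `result`.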

The interpreter then is a normal Python interpreter written in RPython.

Because RPython code is normal Python code, you can run it on any Python interpreter. But none of PyPy's speed claims come from running it that way; this is just for a rapid test cycle, because translating the interpreter takes a long time.
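As a rough picture of what "an interpreter written in a restricted subset of Python" looks like, here is a toy stack-machine interpreter. It is only a sketch: the opcodes and program format are invented for this example, it is not PyPy's interpreter, and whether the RPython toolchain would accept this exact code unchanged has not been checked. The point is that it is ordinary Python, so it runs directly on any Python interpreter for quick testing, while keeping every variable to a single type.

```python
# Toy stack-machine interpreter, written in a restricted, type-stable style.
# Opcode names and the program format are made up for this sketch; this is
# not PyPy's actual interpreter.

PUSH, ADD, MUL, HALT = 0, 1, 2, 3


def run(program):
    """Execute a list of (opcode, int_argument) pairs; return the top of stack."""
    stack = []          # only ever holds ints
    pc = 0              # program counter, always an int
    while pc < len(program):
        opcode, arg = program[pc]
        pc += 1
        if opcode == PUSH:
            stack.append(arg)
        elif opcode == ADD:
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif opcode == MUL:
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        elif opcode == HALT:
            break
        else:
            raise ValueError("unknown opcode")
    return stack[-1]


# Encodes (2 + 3) * 4
program = [(PUSH, 2), (PUSH, 3), (ADD, 0), (PUSH, 4), (MUL, 0), (HALT, 0)]
print(run(program))  # 20
```

Run this way, the loop is of course slow; as the answer says, the speed comes from translating such an interpreter, not from executing it on top of another Python.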

With that understood, it should be immediately obvious that speculations about PyPyPy or PyPyPyPy don't actually make any sense. You have an interpreter written in RPython. You translate it to C code that executes Python quickly. There the process stops; there's no more RPython to speed up by processing it again.

So "How is it possible for PyPy to be faster than CPython" also becomes fairly obvious. PyPy has a better implementation, including a JIT compiler (I believe it's generally not quite as fast without the JIT compiler, which means PyPy is only faster for programs susceptible to JIT compilation). CPython was never designed to be a highly optimising implementation of the Python language (though its developers do try to make it a highly optimised implementation, if you follow the difference).

The really innovative bit of the PyPy project is that they don't write sophisticated GC schemes or JIT compilers by hand. They write the interpreter relatively straightforwardly in RPython, and although RPython is lower level than Python, it's still an object-oriented, garbage-collected language, much higher level than C. Then the translation framework automatically adds things like GC and JIT. So the translation framework is a huge effort, but it applies equally well to the PyPy Python interpreter however they change its implementation, allowing much more freedom to experiment with performance improvements (without worrying about introducing GC bugs or updating the JIT compiler to cope with the changes). It also means that when they get around to implementing a Python 3 interpreter, it will automatically get the same benefits, as will any other interpreter written with the PyPy framework (of which there are a number at varying stages of polish). And all interpreters using the PyPy framework automatically support all platforms supported by the framework.

So the true benefit of the PyPy project is to separate out (as much as possible) all the parts of implementing an efficient platform-independent interpreter for a dynamic language. And then come up with one good implementation of them in one place, that can be re-used across many interpreters. That's not an immediate win like "my Python program runs faster now", but it's a great prospect for the future.

And it can run your Python program faster (maybe).

哭泣的笑容 2024-09-03 14:16:19

Q1. How is this possible?

CPython's reference counting (effectively manual memory management inside the interpreter) can be slower than automatic, tracing garbage collection in some cases.

Limitations in the implementation of the CPython interpreter preclude certain optimisations that PyPy can do (e.g. fine-grained locks).

As Marcelo mentioned, there is the JIT. Being able to confirm the type of an object on the fly can save the multiple pointer dereferences otherwise needed to finally arrive at the method you want to call.
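The idea of confirming a type once and then skipping the lookup can be sketched with a toy cache guarded by a type check. This is only an illustration of the general guard-and-fast-path idea: the `CachedCall` class and the shape classes are hypothetical, and PyPy's JIT operates on traces of the running interpreter, not on a Python-level cache like this one.

```python
# Toy illustration of a type guard plus cached lookup. Names are hypothetical;
# PyPy's JIT works on traces of the interpreter, not on a Python-level cache.

class Circle:
    def __init__(self, r):
        self.r = r

    def area(self):
        return 3 * self.r * self.r   # crude integer "pi" keeps the demo exact


class Square:
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side * self.side


class CachedCall:
    """Remember the method found for the last seen type; re-check with a guard."""

    def __init__(self, method_name):
        self.method_name = method_name
        self.cached_type = None
        self.cached_func = None
        self.lookups = 0    # how many full (slow-path) lookups happened

    def __call__(self, obj):
        if type(obj) is not self.cached_type:      # guard failed: slow path
            self.lookups += 1
            self.cached_type = type(obj)
            self.cached_func = getattr(type(obj), self.method_name)
        return self.cached_func(obj)               # fast path: direct call


call_area = CachedCall("area")
total = sum(call_area(Circle(r)) for r in range(1, 4))   # same type 3 times
total += call_area(Square(2))                            # new type: guard fails
print(total, call_area.lookups)  # 46 2
```

Four calls trigger only two full lookups: one for the first `Circle` and one when the guard fails on `Square`. As long as the guard holds, the call goes straight to the remembered function.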

Q2. Which Python implementation was used to implement PyPy?

The PyPy interpreter is implemented in RPython, which is a statically typed subset of Python (the language, not the CPython interpreter). See https://pypy.readthedocs.org/en/latest/architecture.html for details.

Q3. And what are the chances of a PyPyPy or PyPyPyPy beating their score?

That would depend on the implementation of these hypothetical interpreters. If one of them, for example, took the source, did some kind of analysis on it and converted it directly into tight, target-specific assembly code after running for a while, I imagine it would be quite a bit faster than CPython.

Update: Recently, on a carefully crafted example, PyPy outperformed a similar C program compiled with gcc -O3. It's a contrived case but does exhibit some ideas.

Q4. Why would anyone try something like this?

From the official site: https://pypy.readthedocs.org/en/latest/architecture.html#mission-statement

We aim to provide:

  • a common translation and support framework for producing
    implementations of dynamic languages, emphasizing a clean
    separation between language specification and implementation
    aspects. We call this the RPython toolchain.

  • a compliant, flexible and fast implementation of the Python
    language which uses the above toolchain to enable new advanced
    high-level features without having to encode the low-level
    details.

By separating concerns in this way, our implementation of Python - and
other dynamic languages - is able to automatically generate a
Just-in-Time compiler for any dynamic language. It also allows a
mix-and-match approach to implementation decisions, including many
that have historically been outside of a user's control, such as
target platform, memory and threading models, garbage collection
strategies, and optimizations applied, including whether or not to
have a JIT in the first place.

The C compiler gcc is implemented in C, and the Haskell compiler GHC is written in Haskell. Is there any reason a Python interpreter/compiler should not be written in Python?

南城旧梦 2024-09-03 14:16:19

PyPy is implemented in Python, but it implements a JIT compiler to generate native code on the fly.

The reason to implement PyPy on top of Python is probably that it is simply a very productive language, especially since the JIT compiler makes the host language's performance somewhat irrelevant.

刘备忘录 2024-09-03 14:16:19

PyPy is written in Restricted Python (RPython), a subset of the Python language. It does not run on top of the CPython interpreter, as far as I know. AFAIK, the PyPy interpreter is compiled to machine code, so once installed it does not use a Python interpreter at runtime.

Your question seems to assume that the PyPy interpreter runs on top of CPython while executing code.

Edit: Yes, to use PyPy you first translate the PyPy Python code: either to C (built with gcc), to JVM bytecode, or to .NET CLI code. See Getting Started.
