What is statically typed in RPython?

Posted on 2024-12-01 00:23:06


It is often stated that RPython (a subset of Python) is statically typed. (E.g. on Wikipedia.)

Initially, I wondered how they would add that to Python, and thought they might require statements such as assert isinstance(arg1, ...) at the beginning of each function (but I couldn't really believe that).

Then I looked at some RPython code and it doesn't really look statically typed at all. In many cases, it might be that the compiler can prove that a function argument can only be of certain types but definitely not in all cases.

E.g., this is the RPython implementation of string.split:

def split(value, by, maxsplit=-1):
    bylen = len(by)
    if bylen == 0:
        raise ValueError("empty separator")

    res = []
    start = 0
    while maxsplit != 0:
        next = value.find(by, start)
        if next < 0:
            break
        res.append(value[start:next])
        start = next + bylen
        maxsplit -= 1   # NB. if it's already < 0, it stays < 0

    res.append(value[start:len(value)])
    return res

In the PyPy documentation about RPython, it is said: "variables should contain values of at most one type".

So, do function arguments also count as variables? Or in what sense is RPython statically typed? Or is this actually misstated?

青衫儰鉨ミ守葔 2024-12-08 00:23:06

So, do function arguments also count as variables?

Of course they do. They always do in pretty much every language.

Or in what sense is RPython statically typed? Or is this actually misstated?

The statement is correct. RPython is not Python. Well, it's a subset of it and can be run as Python code. But when you actually compile RPython code, so much dynamism is taken away from you (albeit only after import time, so you can still use metaclasses, generate code from strings, etc., which is used to great effect in some modules) that the compiler (which is not the Python compiler, and is also vastly different from traditional compilers; see the associated documentation) can indeed determine that types are used statically. More accurately, code that relies on dynamism makes it past the parser and everything, but results in a type error at some point.
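As a small illustration of the "at most one type per variable" rule quoted in the question, here is a sketch that runs as ordinary Python. The function names describe and describe_ok are made up for this example. The assumption, not verified here against the toolchain, is that the annotator would reject the first function because result is an int on one path and a str on the other, while the second would pass because result is always a str.

```python
def describe(n):
    # Valid Python, but `result` holds an int on one branch and a str on
    # the other. Assumption: RPython's annotator cannot give `result` a
    # single type and reports an error during translation.
    if n > 0:
        result = n
    else:
        result = "none"
    return result


def describe_ok(n):
    # Same logic, but `result` is a str on every path, so a single type
    # can be inferred for it.
    if n > 0:
        result = str(n)
    else:
        result = "none"
    return result
```

Both functions parse and run under a regular Python interpreter; the difference would only show up when the code is translated.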

In many cases, it might be that the compiler can prove that a function argument can only be of certain types but definitely not in all cases.

Of course not. There's a lot of code that's not statically typed, and quite a bit of statically typed code that the current annotator can't prove to be statically typed. But when such code is encountered, it's a compilation error, period.
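Where the annotator cannot prove a type by itself, an explicit assertion is one way to help it. The sketch below uses hypothetical classes (Node, IntNode, StrNode) and a hypothetical function int_value; it runs as plain Python. That the assertion narrows the inferred type of node during translation is an assumption about the toolchain, not something this snippet demonstrates.

```python
class Node(object):
    """Hypothetical base class for this example."""


class IntNode(Node):
    def __init__(self, value):
        self.value = value


class StrNode(Node):
    def __init__(self, text):
        self.text = text


def int_value(node):
    # Without the assertion, `node` could be any Node, and only IntNode
    # has a `value` attribute. Assumption: the annotator uses this
    # assertion to treat `node` as an IntNode from here on.
    assert isinstance(node, IntNode)
    return node.value
```

Under a regular interpreter the assertion is just a runtime check: int_value(IntNode(3)) returns 3, and passing a StrNode raises AssertionError.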

There are a few points that are important to realize:

Types are inferred, not stated explicitly (well, for the most part; I believe there are a few functions that need assertions to help the annotator). Static typing does not (as you seem to imply in a comment) mean that the type has to be written out (that's called manifest typing); it means that each expression (that includes variables) has a single type that never changes.
All that analysis happens on a whole-program basis! One can't infer a (non-generic) type for a function def add(a, b): return a + b in isolation (the arguments might be ints, floats, strings, lists, etc.), but if the function is called with integer arguments (e.g. integer literals or variables that were previously inferred to contain integers), it is determined that a and b (and, by the type of +, the result of add) are integers too.
Not all code in the PyPy repository is RPython. For example, there are code generators (e.g. in rlib.parsing) that run at compile time and produce RPython code, but are not RPython (frequently with a "NOT_RPYTHON" docstring, by the way). Also, large parts of the standard library are written in full Python (mostly taken straight from CPython).
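The whole-program point about add can be sketched with a toy analysis. This is not RPython's annotator: it is a minimal stand-in written for Python 3 that only looks at literal arguments at call sites, and the names call_site_signatures, infer_signature, CONSISTENT and CONFLICTING are invented for this example. It shows the idea that add gets a type from how the rest of the program calls it, and that conflicting call sites leave it without a single type.

```python
import ast


def call_site_signatures(source, func_name):
    """Collect the argument types of every call to func_name in source.

    Toy restriction: every argument must be a literal.
    """
    signatures = set()
    for node in ast.walk(ast.parse(source)):
        is_target_call = (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == func_name
        )
        if is_target_call:
            signatures.add(
                tuple(type(ast.literal_eval(arg)).__name__ for arg in node.args)
            )
    return signatures


def infer_signature(source, func_name):
    """Return the single signature used program-wide, or raise TypeError."""
    signatures = call_site_signatures(source, func_name)
    if len(signatures) != 1:
        raise TypeError(
            "%s has no single signature: %s" % (func_name, sorted(signatures))
        )
    return next(iter(signatures))


CONSISTENT = """
def add(a, b):
    return a + b

total = add(1, 2)
other = add(40, 2)
"""

# The same program plus one call that uses strings instead of integers.
CONFLICTING = CONSISTENT + 'label = add("a", "b")\n'
```

With these inputs, infer_signature(CONSISTENT, "add") yields ("int", "int"), while infer_signature(CONFLICTING, "add") raises TypeError because add is called with both integers and strings.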

There's a lot of very interesting material on how the whole translation and typing actually works. For example, The RPython Toolchain describes the translation process in general, including type inference, and The RPython Typer describes the type system(s) used.
