对于静态类型语言来说意味着什么?
我的理解是,这意味着人们可以编写一个程序来正式证明用静态类型语言编写的程序将不存在某些(小)缺陷子集。
我的问题如下:
假设我们有两种图灵完备语言,A 和 B。A 被认为是“类型安全”,而“B”被认为不是。假设给我一个程序 L 来检查用 A 编写的任何程序的正确性。什么是阻止我应用 L 将用 B 编写的任何程序翻译为 A。如果 P 从 A 翻译为 B 那么为什么 PL 不是 a对于用 B 编写的任何程序都有效的类型检查器?
我接受过代数方面的培训,并且刚刚开始学习计算机科学,因此可能有一些明显的原因表明这不起作用,但我非常想知道。一段时间以来,整个“类型安全”的事情让我觉得很可疑。
My understanding is that it means that one can potentially write a program to formally prove that a program written in a statically typed language will be free of a certain (small) subset of defects.
My problem with this is as follows:
Assume that we have two turing complete languages, A and B. A is presumed to be 'type safe' and 'B' is presumed not to be. Suppose I am given a program L to check the correctness of any program written in A. What is to stop me from translating any program written in B to A, applying L. If P translates from A to B then why isn't PL a valid type checker for any program written in B?
I'm trained in Algebra and am only just starting to study CS so there might be some obvious reason that this doesn't work but I would very much like to know. This whole 'type safety' thing has smelt fishy to me for a while.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(6)
如果你可以将每个 B'(用 B 编写的程序)翻译成等价的 A'(如果 B' 是正确的,则这是正确的),那么语言 B 也同样享有同样的待遇“类型安全”作为语言 A(当然,在理论意义上;-)——基本上这意味着 B 可以进行完美的类型推断。但这对于动态语言来说是极其有限的——例如,考虑一下:
其中
thefun
(让我们假设)对于 int 参数是类型安全的,但对于 str 参数则不是。现在——你首先如何将其翻译成语言A
......?If you can translate every B' (a program written in B) into an equivalent A' (which is correct if B' is), then language B enjoys just as much "type-safety" as language A (in a theoretical sense, of course;-) -- basically this would mean that B is such that you can do perfect type inferencing. But that's extremely limited for a dynamic language -- e.g., consider:
where
thefun
(let's assume) is typesafe for int argument, but not for str argument. Now -- how do you translate this to languageA
in the first place...?设 A 为图灵完备语言,它应该是静态类型的,并设 A' 为删除类型检查时从 A 获得的语言(但不是类型注释,因为它们还用于其他目的)。 A 接受的节目将是A' 接受的节目的子集。特别是,A' 也将是图灵完备的。
给定你的翻译 P 从 B 到 A(反之亦然)。它应该做什么?它可以执行以下两件事之一:
首先,它可以将 B 的每个程序 y 转换为 A 的程序。在这种情况下,LPy 将始终返回 True,因为 A 的程序根据定义正确键入。
其次,P可以将B的每个程序y翻译成A'的程序。在这种情况下,如果 Py 恰好是 A 的程序,则 LPy 将返回 True,否则返回 False。
由于第一种情况不会产生任何有趣的结果,所以让我们继续讨论第二种情况,这可能就是您的意思。 B 的程序中定义的函数 LP 是否告诉我们有关 B 的程序的任何有趣的信息?我说不,因为它在 P 的变化下不是不变的。由于 A 是图灵完备的,即使在第二种情况下,也可以选择 P,使其图像恰好位于 A 中。那么 LP 将始终为 True。另一方面,可以选择P,以便将一些程序映射到A'中A的补集。在这种情况下,LP 会对 B 的某些(可能是所有)程序吐出 False。正如您所看到的,您不会得到仅取决于 y 的任何内容。
我还可以用以下方式更数学地表达它:存在一个编程语言类别 C,其对象是编程语言,其态射是从一种编程语言到另一种编程语言的翻译器。特别是如果存在态射P:X→ Y,Y 至少与 X 一样具有表达能力。每对图灵完备语言之间都存在两个方向的态射。对于 C 的每个对象 X(即每种编程语言),我们都有一个关联的集合,例如可以由 X 的程序计算的那些部分定义的函数的 {X}(我知道这是不好的符号)。每个态射 P:X - > Y 然后引发包含 {X} -> Y {Y} 组。让我们正式反转所有这些态射 P: X -> Y 诱导恒等式{X}-> {Y}。我将所得类别(用数学术语来说是 C 的本地化)称为 C'。现在包含 A -> A' 是 C' 的态射。然而,它在A'的自同构下不被保留,即态射A→>。 A' 不是 A' 在 C' 中的不变量。换句话说:从这个抽象的角度来看,“静态类型”属性是不可定义的,并且可以任意附加到语言上。
为了使我的观点更清楚,您还可以将 C' 视为三维空间中的几何形状的范畴,以及作为态射的欧几里得运动。 A' 和 B 是两个几何形状,P 和 Q 是将 B 带到 A' 的欧几里得运动,反之亦然。例如,A'和B可以是两个球体。现在让我们在A'上固定一个点,它代表A'的子集A。让我们称这一点为“静态类型”。我们想知道 B 的某个点是否是静态类型的。所以我们取这样一个点 y,通过 P 将其映射到 A' 并测试它是否是我们在 A' 上标记的点。人们很容易看出,这取决于所选择的映射 P,或者换句话说:球体上的标记点不会被该球体的自同构(即将球体映射到其自身的欧几里得运动)所保留。
Let A be your Turing-complete language which is supposed to be statically typed and let A' be the language you get from A when you remove the type checking (but not the type annotations because they also serve other purposes). The accepted programs of A will be a subset of the accepted programs of A'. So in particular, A' will also be Turing-complete.
Given your translator P from B to A (and vice versa). What is it supposed to do? It could do one of two things:
Firstly, it could translate every program y of B to a program of A. In this case, LPy would always return True as programs of A are by definition correctly typed.
Secondly, P could translate every program y of B to a program of A'. In this case, LPy would return True if Py happens to be a program of A and False if not.
As the first case doesn't yield anything interesting, let us stick to the second case, which is probably what you mean. Does the function LP defined on programs of B tell us anything interesting about programs of B? I say no, because it is not invariant under a change of P. As A is Turing-complete, even in the second case P could be chosen so that its image happens to lie in A. Then LP would be constantly True. On the other hand, P could be chosen so that some programs are mapped to the complement of A in A'. In this case LP would spit out False for some (possibly all) programs of B. As you can see, you don't get anything which only depends on y.
I can also put it more mathematically in the following way: There is a category C of programming languages whose objects are the programming languages and whose morphisms are translators from one programming language to another one. In particular if there is a morphism P: X -> Y, Y is at least as expressive as X. Between each pair of Turing-complete languages there are morphisms in both directions. For each object X of C (i.e. for each programming language) we have an associated set, say {X} (bad notation, I know) of those partially defined functions that can be computed by programs of X. Each morphism P: X -> Y then induces an inclusion {X} -> {Y} of sets. Let us formally invert all those morphisms P: X -> Y that induce the identity {X} -> {Y}. I will call the resulting category (which is, in mathematical terms, a localization of C) by C'. Now the inclusion A -> A' is a morphism in C'. However, it is not preserved under automorphisms of A', that is the morphism A -> A' is not an invariant of A' in C'. In other words: from this abstract point of view the attribute "statically typed" is not definable and can be arbitrarily attached to a language.
To make my point clearer you can also think of C' as the category of, say, geometrical shapes in three-dimensional space together with the Euclidean motions as morphisms. A' and B are then two geometrical shapes and P and Q are Euclidean motions bringing B to A' and vice versa. For example, A' and B could be two spheres. Now let us fix a point on A', which shall stand for the subset A of A'. Let us call this point "statically typed". We want to know whether a point of B is statically typed. So we take such a point y, map it via P to A' and test, whether it is our marked point on A'. As one can easily see, this depends on the chosen map P or, to put in other words: A marked point on a sphere is not preserved by automorphisms (that are Euclidean motions that map the sphere onto itself) of that sphere.
与已提出的观点相同的另一种方法是,您的问题构成了矛盾证明:
或两者。我的直觉表明,后者可能是症结所在:类型安全是一种元语言属性。
Another way to make the same point as has been made is that your question constitutes a proof by contradiction that either:
or both. My intuition says that the latter is probably the sticking point: that type-safety is a meta-linguistic property.
这没有什么“可疑”的。 ;)
对于任何重要的 [1] 类型系统而言,图灵完备语言的集合是类型安全的 T 是 严格 子集图灵完备的语言。因此,在一般情况下,不存在从 B 到 A 的翻译器 P-1;因此,任何翻译器兼类型检查器 LP-1 也不会。
对这种说法的本能反应可能是:胡说八道! A 和 B 都是图灵完备的,所以我可以用任何一种语言表达任何可计算函数! em> 事实上,这是正确的——您可以用任何一种语言表达任何可计算函数;然而,很多时候,你也可以表达更多。特别是,您可以构造指称语义未明确定义的表达式,例如那些可能乐意尝试对字符串“foo”和“bar”进行算术和的表达式(这是
Chubsdad Alex Martelli 的回答)。这些类型的表达式可能在语言B中“存在”,但可能根本无法在语言A中表达,因为指称语义未定义,因此没有合理的解释。翻译它们的方法。这是静态类型的一大优势:如果您的类型系统无法在编译时证明上述函数将接收除算术加法运算符的结果明确定义的参数之外的任何参数,则它可以因类型错误而被拒绝。
请注意,虽然上面是用来解释静态类型系统优点的常见示例,但它可能太温和了。一般来说,静态类型系统不必仅限于强制参数的类型正确性,而且实际上可以表达可以在编译时证明的程序的任何所需属性。例如,可以构造类型系统来强制释放文件系统句柄(例如到数据库、文件、网络套接字等)。其收购范围相同。显然,这在生命支持系统等领域非常有价值,在这些领域中,尽可能多的系统参数的可证明正确性是绝对必要的。如果你满足静态类型系统,你可以免费获得这些类型的证明。
[1] 我所说的“非平凡”是指并非所有可能的表达式都是类型正确的。
There's nothing "fishy" about it. ;)
The set of Turing-complete languages which are type-safe with respect to any nontrivial [1] type system T is a strict subset of the Turing-complete languages. As such, in the general case, no translator P-1 from B to A exists; therefore, neither does any translator-cum-type-checker LP-1.
A knee-jerk reaction to this sort of claim might be: Nonsense! Both A and B are Turing-complete, so I can express any computable function in either language! And, indeed, this is correct--you can express any computable function in either language; however, quite often, you can also express quite a bit more. In particular you can construct expressions whose denotational semantics are not well-defined, such as those which might happily try to take the arithmetic sum of the character strings "foo" and "bar" (this is the gist of
ChubsdadAlex Martelli's answer). These sorts of expressions may be "in" the language B, but may simply not be expressible in the language A, because the denotational semantics are undefined, thus there is no sensible way to translate them.This is one of the great strengths of static typing: If your type system is unable to prove, at compile time, that the aforementioned function will receive any parameters but those for which the outcome of the arithmetic addition operator is well-defined, it can be rejected as ill-typed.
Note that while the above is the usual sort of example given to explain the merits of a static type system, it is perhaps too modest. In general, a static type system need not be limited to merely enforcing type-correctness of parameters, but indeed can express any desired property of a program which can be proven at compile time. For example, it is possible to construct type systems which enforce the constraint that one release a filesystem handle (e.g. to a database, file, network socket, etc.) within the same scope in which it was acquired. Obviously, this is tremendously valuable in such domains as life-support systems, among others, where provable correctness of as many parameters of the system as possible is absolutely essential. If you satisfy the static type system, you can get these sorts of proofs for free.
[1] By nontrivial, I mean such that not all possible expressions are well-typed.
我的理解是,这与编译时与运行时有关。在静态类型语言中,大部分类型检查是在编译时执行的。在动态类型语言中,大部分类型检查是在运行时执行的。
My understanding is that this has to do with compile-time vs. run-time. In a statically typed language the majority of type checking is performed during compile-time. In a dynamically typed language, the majority of its type checking is performed at run-time.
让我反过来回答这个问题:
有两种不同类型的“动态”编程。
一种是“动态类型”,这意味着您有某种 shell,您可以通过在该 shell 中输入定义来进行编程,可以将其想象为 Python 的 IDLE shell。
另一种类型的动态规划是更理论化的。动态程序是可以改变其自身源的程序。它需要一定程度的内射,并且通常用于更改微控制器上的程序存储器。有时生成用于数字运算的查找表称为动态规划。
Let me answer this the other way round:
There are two different types of "dynamic" programming.
One is "dynamically typed", which means you have some sort of a shell where you can program by typing definitions into that shell, think of it like Python's IDLE shell.
The other type of dynamic programming, is a more theoretical one. A dynamic program, is one that can change its own source. It needs some level of introjection, and is often used to change program memory on microcontrollers. Sometimes generating lookup tables for number crunching is called dynamic programming.