ML 中的派生类型表达式
所有,
我想在 ML 中导出以下函数的类型表达式:
fun f x y z = y (x z)
现在我知道输入相同的内容会生成类型表达式。但我希望手工得出这些值。
另外,请提及派生类型表达式时要遵循的一般步骤。
All,
I want to derive the type expression for the function below in ML:
fun f x y z = y (x z)
Now I know typing the same would generate the type expression. But I wish to derive these values by hand.
Also, please mention the general steps to follow when deriving type expressions.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
我将尝试以尽可能机械的方式来完成此操作,就像大多数编译器中的实现一样。
让我们分解一下:
这基本上是糖:
让我们添加一些元语法类型变量(这些不是真正的 SML 类型,只是本示例中的占位符):
好的,所以我们可以从此开始生成一个约束系统。 T5是f的最终返回类型。目前,我们将把整个函数的最终类型称为“TX”——一些新鲜的、未知的类型变量。
因此,在您给出的示例中将产生约束的是函数应用程序。它告诉我们表达式中事物的类型。事实上,这是我们唯一掌握的信息!
那么这些应用程序告诉我们什么?
忽略我们上面分配的类型变量,让我们只看一下函数体:
z 没有应用于任何东西,所以我们只需查找我们之前分配给它的类型变量(T4)并使用它作为其类型。
x 应用于 z,但我们还不知道它的返回类型,因此让我们为其生成一个新的类型变量,并使用我们之前分配给 x (T2) 的类型来创建约束:
y 应用于 ( xz),我们称之为 T7。再一次,我们还不知道 y 的返回类型,所以我们只给它一个新变量:
我们还知道 y 的返回类型是函数整个主体的返回类型,我们称之为“T5”较早,所以我们添加了约束:
为了紧凑性,我将对此进行一些整理,并根据函数返回函数的事实为 TX 添加一个约束。这可以通过完全相同的方法导出,只是稍微复杂一些。如果您不相信我们最终会得到这个约束,希望您可以自己做这个练习:
现在我们收集所有约束:
我们开始通过将左侧替换为右侧来求解这个方程组约束系统以及原始表达式中,从最后一个约束开始,一直到顶部。
好吧,目前看来这很糟糕。我们现在并不真正需要表达式的整个主体 - 它只是为了提供一些清晰的解释。基本上在符号表中我们会有这样的东西:
最后一步是将剩下的所有类型变量概括为我们所知道和喜爱的更熟悉的多态类型。基本上这只是一次传递,将第一个未绑定类型变量替换为“a”,将第二个变量替换为“b”,依此类推。
我很确定您会发现您的 SML 编译器也会为该术语建议的类型。我是凭记忆手工完成的,所以如果我在某个地方搞砸了一些东西,我深表歉意:p
我发现很难找到这个推理和类型约束过程的良好解释。我用了两本书来学习它:Andrew Appel 的《ML 中的现代编译器实现》和 Pierce 的《类型和编程语言》。这两个人都不能独立地完全启发我,但在他们两个之间我找到了答案。
I'm going to try to do this in the most mechanical way possible, exactly as the implementation in most compilers would.
Let's break it down:
This is basically sugar for:
Let's add some meta-syntactic type variables (these are not real SML-types, just place holders for this example's sake):
OK, so we can start generating a system of constraints from this. T5 is the eventual return type of f. For the moment, we're going to just call the eventual type of this whole function "TX" - some fresh, unknown type variable.
So the thing that is going to be generating constraints in the example you've given is function application. It tells us about the types of things in the expression. In fact, it's the only information we have!
So what do the applications tell us?
Ignoring the type variables we assigned above, let's just look at the body of the function:
z is not applied to anything, so we're going to just look up what the type variable we assigned to it was earlier (T4) and use that as its type.
x is applied to z, but we don't know its return type yet, so let's generate a fresh type variable for that and use the type we assigned x (T2) earlier to create a constraint:
y is applied to the result of (x z), which we just called T7. Once again, we don't know the return type of y yet, so we'll just give it a fresh variable:
We also know that the return type of y is the return type for the whole body of the function, we we called "T5" earlier, so we add the constraint:
For compactness, I'm going to kludge this a little and add a constraint for TX based on the fact that there are functions being returned by functions. This is derivable by exactly the same method, except it's a little more complex. Hopefully you can do this yourself as an exercise if you're not convinced that we would eventually end up with this constraint:
Now we collect all the constraints:
We start to solve this system of equations by substituting left hand sides with right hand sides in the system of constraints, as well as in the original expression, starting from the last constraint and working our way to the top.
OK, so this looks horrible at the moment. We don't really need the whole body of the expression sitting around at the moment - it was just there to provide some clarity in the explanation. Basically in the symbol table we would have something like this:
The last step is to generalise all the type variables that are left over into the more familiar polymorphic types that we know and love. Basically this is just a pass, replacing the first unbound type variable with 'a, the second with 'b and so on.
Which I'm pretty sure you'll find is the type that your SML compiler will suggest for that term too. I did this by hand and from memory, so apologies if I've botched something somewhere :p
I found it difficult to find a good explanation of this inference and type constraint process. I used two books to learn it - 'Modern Compiler Implementation in ML' by Andrew Appel, and 'Types and Programming Languages' by Pierce. Neither one was independently completely illuminating for me, but between the two of them I figured it out.
要确定某物的类型,您需要查看使用它的每个地方。例如,如果您看到
val h = hd l
,则您知道l
是一个列表(因为hd
将列表作为参数)并且您还知道h
的类型是l
的列表类型。因此,假设h
的类型是a
,而l
的类型是a list
(其中a
是占位符)。现在,如果您看到val h2 = h*2
,您就知道h
和h2
是int
,因为2
是一个 int,您可以将一个 int 与另一个 int 相乘,两个 int 相乘的结果是一个 int。由于我们之前说过h
的类型是a
这意味着a
是int
,因此h
的类型是a
code>l 是int list
。因此,让我们来处理您的函数:
让我们按照计算顺序考虑表达式:首先执行
x z
,即将x
应用于z
>。这意味着 x 是一个函数,因此它的类型为 a -> b。由于z
作为函数的参数给出,因此它的类型必须为a
。因此,x z
的类型是b
,因为它是x
的结果类型。现在使用
x z
的结果调用y
。这意味着y
也是一个函数,其参数类型是x
的结果类型,即b
。所以y
的类型为b -> c
.同样,表达式y (xz)
的类型是c
,因为它是y
的结果类型。由于这些都是函数中的表达式,因此我们无法进一步限制类型,因此最常见的类型为
x
、y
和z
是'a -> 'b
,'b ->分别为 'c
和'a
,整个表达式的类型为'c
。这意味着
f
的整体类型为('a -> 'b) -> ('b -> 'c)-> '一-> 'c
有关如何以编程方式推断类型的说明,请阅读 Hindley–Milner 类型推理。
To determine the type of something you need to look at every place where it is used. For example if you see
val h = hd l
, you know thatl
is a list (becausehd
takes a list as an argument) and you also know that the type ofh
is the type thatl
is a list of. So let's say the type ofh
isa
and the type ofl
isa list
(wherea
is a placeholder). Now if you seeval h2 = h*2
, you know thath
andh2
areint
s, because2
is an int, you can multiply an int with another int and the result of multiplying two ints is an int. Since we previously said the type ofh
isa
this means thata
isint
, so the type ofl
isint list
.So let's tackle your function:
Let's consider the expressions in the order in which they are evaluated: First you do
x z
, i.e. you applyx
toz
. That meansx
is a function, so it has the typea -> b
. Sincez
is given as an argument to the function it has to have the typea
. The type ofx z
is thereforb
because that is the result type ofx
.Now
y
is called with the result ofx z
. This meansy
is also a function and its argument type is the result type ofx
, which isb
. Soy
has the typeb -> c
. Again the type of the expressiony (x z)
is thereforc
because that is the result type ofy
.Since those are all the expressions in the function, we cannot restrict the types any further and therefor the most general types for
x
,y
andz
are'a -> 'b
,'b -> 'c
and'a
respectively and the type of the whole expression is'c
.This means the overall type of
f
is('a -> 'b) -> ('b -> 'c) -> 'a -> 'c
For an explanation of how types are inferred programatically read about Hindley–Milner type inference.
解释类型推断的另一种方法是为每个(子)表达式和每个(子)模式分配一个类型变量。
然后,程序中的每个构造都有一个方程,该方程涉及与该构造相关的那些类型变量。
例如,如果程序包含 fx
'a1 是 f 的类型变量,'a2 是 x 的类型变量,'a3 是“f x”的类型变量,
则应用结果为类型方程:
'a1 = 'a2 -> 'a3
然后,类型推断基本上涉及求解声明的类型方程组。对于机器学习来说,这只需使用统一即可完成,并且手动完成非常容易。
Another way to explain type inference is that every (sub)-expression and every (sub)-pattern are assigned a type variable.
Then, each construct in the program has an equation relating those type variables that are relevant to that construct.
E.g., if the program contains f x
and 'a1 is the type variable for the f, and 'a2 the type variable for the x, and 'a3 is the type variable for "f x",
then the application results in the type equation:
'a1 = 'a2 -> 'a3
Then, type inference basically involves solving the set of type equations for a declaration. For ML this is done just by using unification, and it's pretty easy to do by hand.