实现类型推断
我在这里看到一些关于静态类型与动态类型的有趣讨论。 我通常更喜欢静态类型,因为编译类型检查、更好的文档化代码等。但是,我确实同意,例如,如果按照 Java 的方式完成,它们确实会使代码变得混乱。
因此,我即将开始构建自己的函数式语言,而类型推断是我想要实现的事情之一。 我确实明白这是一个很大的主题,我并不是想创造一些以前没有做过的东西,只是基本的推理......
有关阅读什么内容的任何指示将有助于我解决这个问题? 最好是更务实/实用的东西,而不是更理论的范畴论/类型理论文本。 如果有一个带有数据结构/算法的实现讨论文本,那就太好了。
I see some interesting discussions here about static vs. dynamic typing.
I generally prefer static typing, due to compile type checking, better documented code, etc. However, I do agree that they do clutter up the code if done the way Java does it, for example.
So I'm about to start building a functional style language of my own, and type inference is one of the things that I want to implement. I do understand that it is a big subject, and I'm not trying to create something that has not been done before, just basic inferencing...
Any pointers on what to read up that will help me with this? Preferably something more pragmatic/practical as opposed to more theoretical category theory/type theory texts. If there's an implementation discussion text out there, with data structures/algorithms, that would just be lovely.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(5)
我发现以下资源有助于理解类型推断(按难度递增的顺序排列):
然而,由于最好的学习方法就是实践,我强烈建议通过完成编程语言课程的家庭作业来实现玩具函数式语言的类型推断。
我推荐这两个易于理解的 ML 作业,您可以在一天之内完成它们:
这些作业来自更高级的课程:
多态、存在、递归类型 (PDF)
双向类型检查 (PDF)
子类型和对象 (PDF)
I found the following resources helpful for understanding type inference, in order of increasing difficulty:
However, since the best way to learn is to do, I strongly suggest implementing type inference for a toy functional language by working through a homework assignment of a programming languages course.
I recommend these two accessible homeworks in ML, which you can both complete in less than a day:
These assignments are from a more advanced course:
Implementing MiniML
Polymorphic, Existential, Recursive Types (PDF)
Bi-Directional Typechecking (PDF)
Subtyping and Objects (PDF)
不幸的是,关于这个主题的许多文献都非常丰富。 我也处在你的处境中。 我对这个主题的第一次介绍来自《编程语言:应用程序和解释》
http://www.plai.org/
我将尝试总结抽象的想法,然后总结我没有立即发现的明显细节。 首先,类型推断可以被认为是生成然后解决约束。 要生成约束,您可以递归遍历语法树并在每个节点上生成一个或多个约束。 例如,如果节点是
+
运算符,则操作数和结果必须都是数字。 应用函数的节点与函数的结果具有相同的类型,依此类推。对于没有
let
的语言,可以盲目地通过替换来解决上述约束。 例如:这里,我们可以说if语句的条件必须是布尔型,并且if语句的类型与其
then
和else子句。 由于我们知道
1
和2
是数字,通过替换,我们知道if
语句是一个数字。事情变得令人讨厌的地方,以及我有一段时间无法理解的地方,是处理 let:
在这里,我们将 id 绑定到一个函数,该函数返回您传入的任何内容,也称为恒等函数。 问题在于,每次使用
id
时,函数参数x
的类型都不同。 第二个id
是一个a ->; 类型的函数。 a
,其中a
可以是任何内容。 第一个是(a -> a) -> 类型 (a -> a)
。 这称为let 多态性。 关键是按特定顺序解决约束:首先解决id
定义的约束。 这将是a -> 一个。 然后,可以将
id
类型的新的单独副本替换为使用id
的每个位置的约束,例如a2 -> a2 和 a3 -> a3
。在线资源中并没有轻易解释这一点。 他们会提到算法 W 或 M,但不会提到它们在解决约束方面的工作原理,或者为什么它不会拒绝 let 多态性:这些算法中的每一个都在解决约束时强制执行排序。
我发现这个资源对于将算法 W、M 以及约束生成和求解的一般概念联系在一起非常有帮助。 它有点密集,但比许多更好:
http ://www.cs.uu.nl/research/techreps/repo/CS-2002/2002-031.pdf
许多其他论文也很好:
http://people.cs.uu.nl/bastiaan/papers.html
我希望这有助于澄清一个有些模糊的世界。
It's unfortunate that much of the literature on the subject is very dense. I too was in your shoes. I got my first introduction to the subject from Programming Languages: Applications and Interpretation
http://www.plai.org/
I'll try to summarize the abstract idea followed by details that I did not find immediately obvious. First, type inference can be thought of generating and then solving constraints. To generate constraints, you recurse through the syntax tree and generate one or more constraints on each node. For example, if the node is a
+
operator, the operands and the results must all be numbers. A node that applies a function has the same type as the result of the function, and so on.For a language without
let
, you can blindly solve the above constraints by substitution. For example:here, we can say that the condition of the if statement must be Boolean, and that the type of the if statement is the same as the type of its
then
andelse
clauses. Since we know1
and2
are numbers, by substitution, we know theif
statement is a number.Where things get nasty, and what I couldn't understand for a while, is dealing with let:
Here, we've bound
id
to a function that returns whatever you've passed in, otherwise known as the identity function. The problem is the type of the function's parameterx
is different on each usage ofid
. The secondid
is a function of typea -> a
, wherea
can be anything. The first is of type(a -> a) -> (a -> a)
. This is known as let-polymorphism. The key is to solve constraints in a particular order: first solve constraints for the definition ofid
. This will bea -> a
. Then fresh, separate copies of the type ofid
can be substituted into the constraints for each placeid
is used, for examplea2 -> a2
anda3 -> a3
.That wasn't readily explained in online resources. They'll mention algorithm W or M but not how they work in terms of solving constraints, or why it doesn't barf on let-polymorphism: each of those algorithms enforce an ordering on solving the constraints.
I found this resource extremely helpful to tie Algorithm W, M, and the general concept of constraint generation and solving all together. It's a little dense, but better than many:
http://www.cs.uu.nl/research/techreps/repo/CS-2002/2002-031.pdf
Many of the other papers there are nice too:
http://people.cs.uu.nl/bastiaan/papers.html
I hope that helps clarify a somewhat murky world.
除了函数式语言的 Hindley Milner 之外,另一位
动态语言类型推断的流行方法是
抽象解释
。抽象解释的思想是写一个特殊的
语言的解释器,而不是保持一个环境
具体值(1,假,闭包),它适用于抽象值,又名
类型(int、bool 等)。 因为它正在解释程序
抽象值,这就是为什么它被称为抽象解释。
Pysonar2 是抽象解释的优雅实现
对于Python。 Google用它来分析Python
项目。 基本上它使用
访问者模式
来调度对相关 AST 节点的评估调用。 访问者函数
转换
接受
上下文
将在其中评估当前节点,以及返回当前节点的类型。
In addition to Hindley Milner for functional languages, another
popular approach to type inference for dynamic language is
abstract interpretation
.The idea of abstract interpretation is to write a special
interpreter for the language, instead of keep an environment of
concrete values(1, false, closure), it works on abstract values, aka
types(int, bool, etc). Since it's interpreting the program on
abstract values, that's why it's called abstract interpretation.
Pysonar2 is an elegant implementation of abstract interpretation
for Python. It is used by Google to analyze Python
projects. Basically it uses
visitor pattern
to dispatchevaluation call to relevant AST node. The visitor function
transform
accepts the
context
in which current node will be evaluated, andreturns the type of current node.
类型和编程语言作者:Benjamin C. Pierce
Types and Programming Languages by Benjamin C. Pierce
Lambda Ultimate,从此处开始。
Lambda the Ultimate, starting here.