实现类型推断

发布于 2024-07-10 11:51:06 字数 303 浏览 13 评论 0原文

我在这里看到一些关于静态类型与动态类型的有趣讨论。 我通常更喜欢静态类型,因为编译类型检查、更好的文档化代码等。但是,我确实同意,例如,如果按照 Java 的方式完成,它们确实会使代码变得混乱。

因此,我即将开始构建自己的函数式语言,而类型推断是我想要实现的事情之一。 我确实明白这是一个很大的主题,我并不是想创造一些以前没有做过的东西,只是基本的推理......

有关阅读什么内容的任何指示将有助于我解决这个问题? 最好是更务实/实用的东西,而不是更理论的范畴论/类型理论文本。 如果有一个带有数据结构/算法的实现讨论文本,那就太好了。

I see some interesting discussions here about static vs. dynamic typing.
I generally prefer static typing, due to compile type checking, better documented code, etc. However, I do agree that they do clutter up the code if done the way Java does it, for example.

So I'm about to start building a functional style language of my own, and type inference is one of the things that I want to implement. I do understand that it is a big subject, and I'm not trying to create something that has not been done before, just basic inferencing...

Any pointers on what to read up that will help me with this? Preferably something more pragmatic/practical as opposed to more theoretical category theory/type theory texts. If there's an implementation discussion text out there, with data structures/algorithms, that would just be lovely.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(5

鸩远一方 2024-07-17 11:51:06

我发现以下资源有助于理解类型推断(按难度递增的顺序排列):

  1. 免费图书 PLAI 的第 30 章(类型推断) 编程语言:应用程序和解释,草图基于统一的类型推断。
  2. 暑期课程将类型解释为抽象值呈现优雅使用 Haskell 作为元语言的求值器、类型检查器、类型重构器和推理器。
  3. EOPL 一书的第 7 章(类型),编程语言要点
  4. TAPL 书的第 22 章(类型重构),类型与编程语言,以及相应的 OCaml 实现recon 和 fullrecon
  5. 新书 DCPL 的第 13 章(类型重构),编程语言中的设计概念
  6. 学术论文精选
  7. 闭包编译器的 TypeInference 是数据流分析方法的示例类型推断,这更适合 Hindler Milner 方法的动态语言。

然而,由于最好的学习方法就是实践,我强烈建议通过完成编程语言课程的家庭作业来实现玩具函数式语言的类型推断。

我推荐这两个易于理解的 ML 作业,您可以在一天之内完成它们:

  1. PCF 解释器解决方案)暖身。
  2. PCF 类型推断 (解决方案)用于实现 Hindley-Milner 类型推断的算法 W。

这些作业来自更高级的课程:

  1. < p>实现 MiniML
  2. 多态、存在、递归类型 (PDF)

  3. 双向类型检查 (PDF)

  4. 子类型和对象 (PDF)

I found the following resources helpful for understanding type inference, in order of increasing difficulty:

  1. Chapter 30 (Type Inference) of the freely available book PLAI, Programming Languages: Application and Interpretation, sketches unification-based type inference.
  2. The summer course Interpreting types as abstract values presents elegant evaluators, type checkers, type reconstructors and inferencers using Haskell as a metalanguage.
  3. Chapter 7 (Types) of the book EOPL, Essentials of Programming Languages.
  4. Chapter 22 (Type Reconstruction) of the book TAPL, Types and Programming Languages, and the corresponding OCaml implementations recon and fullrecon.
  5. Chapter 13 (Type Reconstruction) of the new book DCPL, Design Concepts in Programming Languages.
  6. Selection of academic papers.
  7. Closure compiler's TypeInference is an example of the data-flow analysis approach to type inference, which is better suited to dynamic languages that the Hindler Milner approach.

However, since the best way to learn is to do, I strongly suggest implementing type inference for a toy functional language by working through a homework assignment of a programming languages course.

I recommend these two accessible homeworks in ML, which you can both complete in less than a day:

  1. PCF Interpreter (a solution) to warm up.
  2. PCF Type Inference (a solution) to implement algorithm W for Hindley-Milner type inference.

These assignments are from a more advanced course:

  1. Implementing MiniML

  2. Polymorphic, Existential, Recursive Types (PDF)

  3. Bi-Directional Typechecking (PDF)

  4. Subtyping and Objects (PDF)

执笔绘流年 2024-07-17 11:51:06

不幸的是,关于这个主题的许多文献都非常丰富。 我也处在你的处境中。 我对这个主题的第一次介绍来自《编程语言:应用程序和解释》

http://www.plai.org/

我将尝试总结抽象的想法,然后总结我没有立即发现的明显细节。 首先,类型推断可以被认为是生成然后解决约束。 要生成约束,您可以递归遍历语法树并在每个节点上生成一个或多个约束。 例如,如果节点是+运算符,则操作数和结果必须都是数字。 应用函数的节点与函数的结果具有相同的类型,依此类推。

对于没有let的语言,可以盲目地通过替换来解决上述约束。 例如:

(if (= 1 2) 
    1 
    2)

这里,我们可以说if语句的条件必须是布尔型,并且if语句的类型与其thenelse子句。 由于我们知道 12 是数字,通过替换,我们知道 if 语句是一个数字。

事情变得令人讨厌的地方,以及我有一段时间无法理解的地方,是处理 let:

(let ((id (lambda (x) x)))
    (id id))

在这里,我们将 id 绑定到一个函数,该函数返回您传入的任何内容,也称为恒等函数。 问题在于,每次使用 id 时,函数参数 x 的类型都不同。 第二个id是一个a ->; 类型的函数。 a,其中 a 可以是任何内容。 第一个是 (a -> a) -> 类型 (a -> a)。 这称为let 多态性。 关键是按特定顺序解决约束:首先解决 id 定义的约束。 这将是 a -> 一个。 然后,可以将 id 类型的新的单独副本替换为使用 id 的每个位置的约束,例如 a2 -> a2 和 a3 -> a3

在线资源中并没有轻易解释这一点。 他们会提到算法 W 或 M,但不会提到它们在解决约束方面的工作原理,或者为什么它不会拒绝 let 多态性:这些算法中的每一个都在解决约束时强制执行排序。

我发现这个资源对于将算法 W、M 以及约束生成和求解的一般概念联系在一起非常有帮助。 它有点密集,但比许多更好:

http ://www.cs.uu.nl/research/techreps/repo/CS-2002/2002-031.pdf

许多其他论文也很好:

http://people.cs.uu.nl/bastiaan/papers.html

我希望这有助于澄清一个有些模糊的世界。

It's unfortunate that much of the literature on the subject is very dense. I too was in your shoes. I got my first introduction to the subject from Programming Languages: Applications and Interpretation

http://www.plai.org/

I'll try to summarize the abstract idea followed by details that I did not find immediately obvious. First, type inference can be thought of generating and then solving constraints. To generate constraints, you recurse through the syntax tree and generate one or more constraints on each node. For example, if the node is a + operator, the operands and the results must all be numbers. A node that applies a function has the same type as the result of the function, and so on.

For a language without let, you can blindly solve the above constraints by substitution. For example:

(if (= 1 2) 
    1 
    2)

here, we can say that the condition of the if statement must be Boolean, and that the type of the if statement is the same as the type of its then and else clauses. Since we know 1 and 2 are numbers, by substitution, we know the if statement is a number.

Where things get nasty, and what I couldn't understand for a while, is dealing with let:

(let ((id (lambda (x) x)))
    (id id))

Here, we've bound id to a function that returns whatever you've passed in, otherwise known as the identity function. The problem is the type of the function's parameter x is different on each usage of id. The second id is a function of type a -> a, where a can be anything. The first is of type (a -> a) -> (a -> a). This is known as let-polymorphism. The key is to solve constraints in a particular order: first solve constraints for the definition of id. This will be a -> a. Then fresh, separate copies of the type of id can be substituted into the constraints for each place id is used, for example a2 -> a2 and a3 -> a3.

That wasn't readily explained in online resources. They'll mention algorithm W or M but not how they work in terms of solving constraints, or why it doesn't barf on let-polymorphism: each of those algorithms enforce an ordering on solving the constraints.

I found this resource extremely helpful to tie Algorithm W, M, and the general concept of constraint generation and solving all together. It's a little dense, but better than many:

http://www.cs.uu.nl/research/techreps/repo/CS-2002/2002-031.pdf

Many of the other papers there are nice too:

http://people.cs.uu.nl/bastiaan/papers.html

I hope that helps clarify a somewhat murky world.

热鲨 2024-07-17 11:51:06

除了函数式语言的 Hindley Milner 之外,另一位
动态语言类型推断的流行方法是
抽象解释

抽象解释的思想是写一个特殊的
语言的解释器,而不是保持一个环境
具体值(1,假,闭包),它适用于抽象值,又名
类型(int、bool 等)。 因为它正在解释程序
抽象值,这就是为什么它被称为抽象解释。

Pysonar2 是抽象解释的优雅实现
对于Python。 Google用它来分析Python
项目。 基本上它使用访问者模式来调度
对相关 AST 节点的评估调用。 访问者函数 转换
接受 上下文 将在其中评估当前节点,以及
返回当前节点的类型。

In addition to Hindley Milner for functional languages, another
popular approach to type inference for dynamic language is
abstract interpretation.

The idea of abstract interpretation is to write a special
interpreter for the language, instead of keep an environment of
concrete values(1, false, closure), it works on abstract values, aka
types(int, bool, etc). Since it's interpreting the program on
abstract values, that's why it's called abstract interpretation.

Pysonar2 is an elegant implementation of abstract interpretation
for Python. It is used by Google to analyze Python
projects. Basically it uses visitor pattern to dispatch
evaluation call to relevant AST node. The visitor function transform
accepts the context in which current node will be evaluated, and
returns the type of current node.

注定孤独终老 2024-07-17 11:51:06

类型和编程语言作者:Benjamin C. Pierce

Types and Programming Languages by Benjamin C. Pierce

失眠症患者 2024-07-17 11:51:06

Lambda Ultimate,从此处开始。

Lambda the Ultimate, starting here.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文