What is the relationship between unboxed types and strictness?

Posted on 2024-09-07 11:05:40

Unboxed types, like Int#, and strict functions, like f (!x) = ..., are something different, but I see conceptual similarity - they disallow thunks/laziness in some way. If Haskell was a strict language like Ocaml, every function would be strict and every type unboxed. What is the relationship between unboxed types and enforcing strictness?

Comments (3)

似梦非梦 2024-09-14 11:05:41

Unboxed vs Boxed Data

To support parametric polymorphism and laziness, by default Haskell data types are represented uniformly as a pointer to a closure on the heap, with a structure like this:

[Image: heap closure layout diagram (source: haskell.org)]

These are "boxed" values. An unboxed object is represented by the value itself directly, without any indirection or closure. Int is boxed, but Int# is unboxed.
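As a minimal sketch of the boxed/unboxed distinction, the snippet below uses GHC's `MagicHash` extension and the `I#` constructor exported by `GHC.Exts`. The helper names `unbox`, `box` and `addViaUnboxed` are made up for illustration and are not part of any library.

```haskell
{-# LANGUAGE MagicHash #-}
import GHC.Exts (Int (I#), Int#, (+#))

-- In GHC, Int is an ordinary data type wrapping a raw machine integer:
--   data Int = I# Int#
-- Matching on I# forces the box and exposes the unboxed payload.
unbox :: Int -> Int#
unbox (I# n) = n

-- Wrap a raw machine integer back into a heap-allocated box.
box :: Int# -> Int
box n = I# n

-- Add two boxed Ints by going through the primitive unboxed addition.
addViaUnboxed :: Int -> Int -> Int
addViaUnboxed x y = box (unbox x +# unbox y)
```

Here `addViaUnboxed 2 3` evaluates to `5`, the same result as `2 + 3`; only the intermediate representation differs.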

Lazy values require a boxed representation. Strict values do not: they can be represented either as fully evaluated closures on the heap, or as primitive unboxed structures. Note that pointer tagging is an optimization that we can use on boxed objects, to encode the constructor in the pointer to the closure.

The Relationship to Strictness

Normally, unboxed values are generated in an ad hoc fashion by functional language compilers. In Haskell, however, unboxed values are special. They:

  1. have a different kind, # (newer GHC versions write this as TYPE r, for example TYPE 'IntRep for Int#);
  2. can only be used in special places; and
  3. are unlifted, so they are not represented as a pointer to a heap value.

Because they are unlifted, they are necessarily strict: there is no way to represent an unevaluated (lazy) value.

So particular unboxed types, like Int# and Double#, really are represented just as int or double on the machine (in C notation).
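The point that an unlifted value cannot be lazy can be sketched as follows. Both functions ignore their argument, but only the lifted one can be handed an unevaluated thunk. The names `ignoreLifted`, `ignoreUnlifted` and `throws` are hypothetical helpers written for this example.

```haskell
{-# LANGUAGE MagicHash #-}
import Control.Exception (SomeException, catch, evaluate)
import GHC.Exts (Int (I#), Int#)

-- Lifted argument: it may arrive as a thunk that is never forced.
ignoreLifted :: Int -> Int
ignoreLifted _ = 0

-- Unlifted argument: there is no thunk representation, so the caller
-- has to compute the raw Int# before the call can happen.
ignoreUnlifted :: Int# -> Int
ignoreUnlifted _ = 0

-- Hypothetical helper: True if forcing the value to WHNF throws.
throws :: a -> IO Bool
throws x = (evaluate x >> pure False) `catch` handler
  where
    handler :: SomeException -> IO Bool
    handler _ = pure True
```

With these definitions, `throws (ignoreLifted undefined)` yields `False`, while `throws (ignoreUnlifted (case undefined of I# n -> n))` yields `True`, because producing the `Int#` argument requires evaluating `undefined` first.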

Strictness Analysis

Separately, GHC does strictness analysis of regular Haskell types. If a value's use is found to be strict, i.e. the value is always demanded, so evaluating it early cannot change the result, the optimizer might replace uses of the regular type (e.g. Int) with an unboxed one (Int#). Since it knows that the use of Int is always strict, replacement with the more efficient (and always strict) type Int# is safe.
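The following is a hand-written approximation of what such a transformation might produce, not actual GHC output. `sumTo` is an ordinary function that is strict in both arguments; `sumToWorker` and `sumToWrapper` are hypothetical names for an unboxed loop and the thin boxed interface around it.

```haskell
{-# LANGUAGE MagicHash #-}
import GHC.Exts (Int (I#), Int#, (+#), (-#))

-- Source-level definition on boxed Ints. Both arguments are always
-- demanded, which is the property strictness analysis looks for.
-- Intended for non-negative n.
sumTo :: Int -> Int -> Int
sumTo acc 0 = acc
sumTo acc n = sumTo (acc + n) (n - 1)

-- Sketch of a worker: the loop runs entirely on raw machine integers,
-- so no boxes or thunks are allocated per iteration.
sumToWorker :: Int# -> Int# -> Int#
sumToWorker acc 0# = acc
sumToWorker acc n  = sumToWorker (acc +# n) (n -# 1#)

-- Sketch of a wrapper: unbox once on entry, rebox once on exit.
sumToWrapper :: Int -> Int -> Int
sumToWrapper (I# acc) (I# n) = I# (sumToWorker acc n)
```

Both `sumTo 0 100` and `sumToWrapper 0 100` evaluate to `5050`; the wrapper version simply makes the unboxing explicit in the source.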

We can of course have strict types without unboxed types, for example, an element-strict polymorphic list:

data List a = Empty | Cons !a (List a)

is strict in its elements, but does not represent them as unboxed values.
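A small sketch of that difference: the element-strict list is compared with a lazy variant, `LazyList`, which is introduced here only for contrast. `throws` is a hypothetical helper that reports whether forcing a value raises an exception.

```haskell
import Control.Exception (SomeException, catch, evaluate)

-- Element-strict list from the text: building a Cons forces the element.
data List a = Empty | Cons !a (List a)

-- Lazy counterpart, for comparison only.
data LazyList a = LEmpty | LCons a (LazyList a)

-- Hypothetical helper: True if forcing the value to WHNF throws.
throws :: a -> IO Bool
throws x = (evaluate x >> pure False) `catch` handler
  where
    handler :: SomeException -> IO Bool
    handler _ = pure True
```

`throws (Cons (undefined :: Int) Empty)` yields `True`, whereas `throws (LCons (undefined :: Int) LEmpty)` yields `False`. The strict list still works at any element type, e.g. `Cons "text" Empty`, because each element remains an ordinary boxed value that has merely been evaluated.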

This also points out the mistake you made about strict languages, like OCaml. They still need to support polymorphism, so either they provide a uniform representation, or they specialize data types and functions to every type. GHC by default uses uniform representation, as does OCaml, though GHC can also specialize types and functions now (like C++ templates).

淡莣 2024-09-14 11:05:41

Unboxed types are necessarily strict, but not all strict values are necessarily unboxed.

data Foo a = Foo !a !a

has two strict fields

data Bar a = Bar {-# UNPACK #-} !Int !a

has two strict fields, but the first one is unboxed.

Ultimately, the reason unboxed types are (necessarily) strict is that there is no place to store a thunk, as they are just flat, dumb data at that point.
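To make the `UNPACK` case more concrete, here is a sketch that places `Bar` next to a hand-unboxed equivalent. `BarManual`, `mkManual`, `firstBar` and `firstManual` are names invented for this example, and the assumption is that GHC honours the pragma, which it typically does only when optimization is enabled.

```haskell
{-# LANGUAGE MagicHash #-}
import GHC.Exts (Int (I#), Int#)

-- First field strict and unpacked, second field strict but still boxed.
data Bar a = Bar {-# UNPACK #-} !Int !a

-- Roughly the layout requested by the pragma: the raw Int# is stored
-- inline in the constructor instead of a pointer to an Int box.
data BarManual a = BarManual Int# !a

-- Build the manual version from an ordinary Int by unboxing it once.
mkManual :: Int -> a -> BarManual a
mkManual (I# n) x = BarManual n x

-- With UNPACK, reboxing on field access is handled by the compiler.
firstBar :: Bar a -> Int
firstBar (Bar n _) = n

-- Without it, the reboxing has to be written by hand.
firstManual :: BarManual a -> Int
firstManual (BarManual n _) = I# n
```

`firstBar (Bar 7 'x')` and `firstManual (mkManual 7 'x')` both evaluate to `7`. The pragma keeps the convenient boxed interface while asking for the flat representation.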

平安喜乐 2024-09-14 11:05:41

Arguments of any type can be made "strict", but the only unboxed types that have corresponding boxed types are Char#, Int#, Word#, Double# and Float#.

If you know low-level languages like C, it's easier to explain. Unboxed types are like int, double, etc., and the boxed types are like int*, double*, etc. When you've got an int, you already know the whole value, as it's represented in the bit pattern; therefore it is not lazy. It must be strict too, as every value of int is valid and none of them is ⊥.

However, given an int* you may choose to dereference the pointer later to get the actual value (thus lazy), and it is possible to have invalid pointers (it contains ⊥, i.e. non-strict).
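The first point of this answer, that an argument of any type can be made strict even when no unboxed counterpart exists, can be sketched with a bang pattern on a polymorphic argument. `strictConst`, `lazyConst` and `throws` are hypothetical names used only here.

```haskell
{-# LANGUAGE BangPatterns #-}
import Control.Exception (SomeException, catch, evaluate)

-- Strict in its argument: the bang forces it to weak head normal form,
-- but the value itself stays boxed.
strictConst :: a -> Int
strictConst !x = 0

-- Lazy in its argument: the thunk is never touched.
lazyConst :: a -> Int
lazyConst x = 0

-- Hypothetical helper: True if forcing the value to WHNF throws.
throws :: a -> IO Bool
throws x = (evaluate x >> pure False) `catch` handler
  where
    handler :: SomeException -> IO Bool
    handler _ = pure True
```

`throws (lazyConst (undefined :: String))` yields `False` and `throws (strictConst (undefined :: String))` yields `True`. Note that `throws (strictConst ("abc" ++ undefined))` also yields `False`: the bang only evaluates the outermost constructor of the boxed string, which is a weaker guarantee than the flat, fully known bit pattern of an unboxed value.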
