Advantage of strict fields in data types
This may be a bit fuzzy, but I've been wondering about it for a while. To my knowledge, with ! one can make sure that a parameter of a data constructor is evaluated before the value is constructed:
data Foo = Bar !Int !Float
I have often thought that laziness is a great thing. Now, when I go through sources, I see strict fields more often than the !-less variant.
What is the advantage of this, and why shouldn't I leave the fields lazy?
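To make the behaviour being asked about concrete, here is a minimal sketch that observes when a field is forced. The LazyFoo type and the throws helper are hypothetical names introduced only for this illustration; the check relies on error raising an ErrorCall once its thunk is evaluated.

```haskell
import Control.Exception (ErrorCall (..), evaluate, try)

-- Same shape as the Foo in the question, with strict fields.
data Foo = Bar !Int !Float

-- Hypothetical lazy counterpart, for comparison only.
data LazyFoo = LazyBar Int Float

-- True if running the action raised an ErrorCall.
throws :: IO a -> IO Bool
throws act = do
  r <- try act
  case r of
    Left (ErrorCall _) -> pure True
    Right _            -> pure False

main :: IO ()
main = do
  -- evaluate brings the constructor application to WHNF.
  strictForced <- throws (evaluate (Bar (error "forced") 1.0))
  lazyForced   <- throws (evaluate (LazyBar (error "forced") 1.0))
  print strictForced  -- True: the strict field is evaluated along with the constructor
  print lazyForced    -- False: the lazy field stays an unevaluated thunk
```

Note that the strict field is forced when the constructor application itself is evaluated, not at the point where the expression is written down.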
Unless you're storing a large computation in the Int and Float fields, significant overhead can build up from lots of trivial computations accumulating in thunks. For instance, if you repeatedly add 1 to a lazy Float field in a data type, it will use up more and more memory until you actually force the field and the value is calculated.
Often, you do want to store an expensive computation in a field. But if you know ahead of time that you won't be doing anything like that, you can mark the field strict and avoid having to add seq manually everywhere to get the efficiency you desire.
As an additional bonus, when given the flag -funbox-strict-fields, GHC will unpack strict fields [1] of data types directly into the data type itself. This is possible because it knows they will always be evaluated, so no thunk has to be allocated; in this case, a Bar value would contain the machine words comprising the Int and Float directly inside the Bar value in memory, rather than two pointers to separate heap objects (boxed values or thunks) that hold the data.
Laziness is a very useful thing, but some of the time it just gets in the way and impedes computation, especially for small fields that are always looked at (and thus forced), or that are modified often but never with very expensive computations. Strict fields help overcome these issues without having to modify all uses of the data type.
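The thunk build-up described above can be sketched as follows. The counter types and function names are hypothetical, and the UNPACK pragma is used here as a per-field alternative to the -funbox-strict-fields flag. The difference is easiest to see without optimization, since GHC's strictness analysis may remove the thunks in the lazy version on its own when compiling with -O.

```haskell
import Data.List (foldl')

-- Hypothetical counter types, for illustration only.
data LazyCounter   = LazyCounter Float
data StrictCounter = StrictCounter {-# UNPACK #-} !Float

bumpLazy :: LazyCounter -> LazyCounter
bumpLazy (LazyCounter x) = LazyCounter (x + 1)      -- stores the thunk (x + 1)

bumpStrict :: StrictCounter -> StrictCounter
bumpStrict (StrictCounter x) = StrictCounter (x + 1) -- (x + 1) is computed when the constructor is evaluated

-- foldl' forces only the constructor at each step, so the lazy field
-- can grow into a chain of n pending additions.
runLazy :: Int -> Float
runLazy n = case foldl' (\c _ -> bumpLazy c) (LazyCounter 0) [1 .. n] of
  LazyCounter x -> x   -- forcing x here evaluates the whole chain

-- With the strict field, each step leaves a fully evaluated Float behind.
runStrict :: Int -> Float
runStrict n = case foldl' (\c _ -> bumpStrict c) (StrictCounter 0) [1 .. n] of
  StrictCounter x -> x

main :: IO ()
main = do
  print (runLazy 100000)
  print (runStrict 100000)
```

Both versions compute the same result; what differs is how much memory is held while the loop runs.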
Whether it's more common than lazy fields or not depends on the type of code you're reading; you aren't likely to see any functional tree structures use strict fields extensively, for instance, because they benefit greatly from laziness.
Let's say you have an AST with a constructor for infix operations that has one Op field and two Exp fields.
You wouldn't want to make the Exp fields strict, as applying a policy like that would mean that the entire AST is evaluated whenever you look at the top-level node, which is clearly not what you want if you are to benefit from laziness. However, the Op field is never going to contain an expensive computation that you want to defer to a later date, and the overhead of a thunk per infix operator might get expensive if you have really deeply nested parse trees. So for the infix constructor, you'd want to make the Op field strict, but leave the two Exp fields lazy.
[1] Only fields whose type has a single constructor can be unpacked.
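As a rough sketch of the AST discussed in this answer: the Op constructors, the Lit constructor and the helper functions below are assumptions made for illustration; only the shape of the infix constructor (a strict Op field and two lazy Exp fields) comes from the text.

```haskell
-- Hypothetical operator type, for illustration only.
data Op = Plus | Times
  deriving (Eq, Show)

data Exp
  = Lit Int
  | Infix !Op Exp Exp   -- strict operator, lazy subexpressions

-- Inspect only the top-level operator; the subtrees are never forced.
topOp :: Exp -> Maybe Op
topOp (Infix op _ _) = Just op
topOp (Lit _)        = Nothing

-- A small evaluator that forces subtrees only as it descends into them.
eval :: Exp -> Int
eval (Lit n)           = n
eval (Infix Plus l r)  = eval l + eval r
eval (Infix Times l r) = eval l * eval r

main :: IO ()
main = do
  -- The undefined subtrees are harmless because the Exp fields are lazy.
  print (topOp (Infix Plus undefined undefined))
  print (eval (Infix Times (Lit 6) (Infix Plus (Lit 3) (Lit 4))))
```

Had the Exp fields been strict as well, the first call would fail, because building the top-level node would force both subtrees.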
In addition to the information provided by other answers, keep in mind:
It's interesting to look at how deeply the parameter is evaluated: as with seq and $!, it is evaluated only to weak head normal form (WHNF).
Take a datatype Foo whose constructors IntFoo, FooFoo and BarFoo each have one strict field (of type Int, Foo and Bar respectively), and a datatype Bar whose constructor IntBar has one lazy Int field. An expression that applies IntFoo to the sum 1 + 2 + 3, evaluated to WHNF, produces the value
IntFoo 6
(fully evaluated, that is, in normal form). Likewise, wrapping that expression in FooFoo and evaluating it to WHNF produces the value
FooFoo (IntFoo 6)
(again fully evaluated, in normal form). However, applying BarFoo to IntBar applied to the same sum and evaluating it to WHNF produces the value
BarFoo (IntBar (1 + 2 + 3))
(not fully evaluated, not in normal form).
Main point: the strictness of the !Bar parameter won't necessarily help if the data constructors of Bar don't contain strict parameters themselves.
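The three cases above can be checked with a small sketch. The datatype definitions here are assumptions, chosen to be consistent with the results quoted in this answer, and the sum is replaced by a call to error so that forcing the payload becomes observable; forcesPayload is a hypothetical helper.

```haskell
import Control.Exception (ErrorCall (..), evaluate, try)

-- Assumed definitions, chosen to match the results quoted in the answer.
data Bar = IntBar Int                               -- lazy field
data Foo = IntFoo !Int | FooFoo !Foo | BarFoo !Bar  -- strict fields

-- Evaluate a Foo to WHNF and report whether that forced the payload,
-- which is signalled by the ErrorCall raised by error.
forcesPayload :: Foo -> IO Bool
forcesPayload foo = do
  r <- try (evaluate foo)
  case r of
    Left (ErrorCall _) -> pure True
    Right _            -> pure False

main :: IO ()
main = do
  a <- forcesPayload (IntFoo (error "payload"))
  b <- forcesPayload (FooFoo (IntFoo (error "payload")))
  c <- forcesPayload (BarFoo (IntBar (error "payload")))
  -- Strictness chains through IntFoo and FooFoo, but stops at the lazy IntBar field.
  print (a, b, c)  -- (True,True,False)
```

Making the IntBar field strict as well would turn the third result into True, because the chain of strict fields would then reach the payload.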
There is an overhead associated with laziness: the compiler has to create a thunk for the value to store the computation until the result is needed. If you know that you'll always need the result sooner or later, then it can make sense to force the evaluation of the result.
Laziness comes at a cost; otherwise every language would have it.
The cost is twofold: