当前位置：文江博客话题详情

haskell

具有类约束类型的值实际上在运行时是一个函数吗？

发布于 2024-12-08 06:32:21 字数 1028 浏览 0 评论 0 原文

考虑一下著名的

fibs = 0 : 1 : zipWith (+) fibs (tail fibs)

假设，为了避免单态限制，它被注释为：

fibs :: Num a => [a]

这似乎意味着在运行时，列表值fibs并不真正存在，而是一个重新计算列表的函数每次选择 fibs 的元素时？

问题是在您所知道的不同 Haskell 实现中实际上如何处理此类情况。

--- 增加 ---- 我觉得我必须再详细一点。考虑：

fibsInteger :: [Integer]
fibsInteger = 0: 1: zipWith (+) fibsInteger (tail fibsInteger)

该值

(fibsInteger !! 42)

并假设在程序执行期间需要评估。在这种情况下，我希望后续的类似评估会发现 fibsInteger 的前 43 个元素已经被评估。这还意味着 fibsInteger 本身及其前 42 个尾部已经在 WHNF 中。

然而，据我所知，对于多态的fib来说这是不可能的。 FUZxxl 的评论

因为类型类通常会引入一个新参数，其中包含具有该类型类函数的字典

似乎支持我的观点，即像 fibs 这样的值在运行时有效地显示为函数？

如果是这样，那么像 ((maximum .map (fibs!!)) [100000 .. 101000] :: Integer) 这样的应用程序比非多态变体需要明显更长的时间来评估<代码>((最大.map(fibsInteger!!)) [100000 .. 101000] :: 整数)因为前 100000 个数字每次都必须重新计算。（不幸的是，我现在不能尝试这个）

原文

Consider the famous

fibs = 0 : 1 : zipWith (+) fibs (tail fibs)

Suppose that, to avoid the monomorphism restriction, this is annotated with:

fibs :: Num a => [a]

This seems to imply that at runtime, a list value fibs does not really exist, but rather a function that computes the list anew each time an element of fibs is picked?

The question is how such cases are actually handled in different Haskell implementations you know of.

--- ADDED ----
I feel that I must elaborate a bit more. Consider:

fibsInteger :: [Integer]
fibsInteger = 0: 1: zipWith (+) fibsInteger (tail fibsInteger)

and suppose during program execution the value

(fibsInteger !! 42)

needs to be evaluated. In that case I would expect that subsequent evaluations like that would find that the first 43 elements of fibsInteger are already evaluated. This implies also that fibsInteger itself and the first 42 tails of it are already in WHNF.

Yet that would not be possible with a polymorphic fibs, as far as I can see. FUZxxl's remark

because a typeclass usually introduces a new argument containing a
dictionary with the functions of that typeclass

seems to support my view that a value like fibs effectively appears as function at runtime?

If this were so, then an application like ((maximum . map (fibs!!)) [100000 .. 101000] :: Integer) shold take noticeably longer to evaluate than the non-polymorphic variant ((maximum . map (fibsInteger!!)) [100000 .. 101000] :: Integer) because the first 100000 numbers will have to be recomputed each time.
(Unfortunately, I can't try this out at this time)

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

他不在意 2024-12-15 06:32:21

这取决于实施。在 GHC 中，类型类是使用字典来实现的。假设 Num 类的定义如下（本示例进行了简化）：

class Num a where
    fromInteger :: Integer -> a
    (+) :: a -> a -> a

然后它将被编译为“字典”数据类型：

data Num a = Num { fromInteger :: Integer -> a, plus :: a -> a -> a }

任何具有 Num 约束的内容都会得到字典的一个额外参数，例如 foo x = x + 1 将变为：

foo :: Num a -> a -> a
foo num x = plus num x (fromInteger num 1)

那么让我们看看 GHC 如何编译 fibs，好吗？

$ cat Fibs.hs
module Fibs where
fibs :: Num a => [a]
fibs = 0 : 1 : zipWith (+) fibs (tail fibs)
$ ghc -c Fibs.hs -ddump-simpl

==================== Tidy Core ====================
Rec {
Fibs.fibs [Occ=LoopBreaker]
  :: forall a_abu. GHC.Num.Num a_abu => [a_abu]
[GblId, Arity=1]
Fibs.fibs =
  \ (@ a_akv) ($dNum_akw :: GHC.Num.Num a_akv) ->
    GHC.Types.:
      @ a_akv
      (GHC.Num.fromInteger
         @ a_akv $dNum_akw (GHC.Integer.smallInteger 0))
      (GHC.Types.:
         @ a_akv
         (GHC.Num.fromInteger
            @ a_akv $dNum_akw (GHC.Integer.smallInteger 1))
         (GHC.List.zipWith
            @ a_akv
            @ a_akv
            @ a_akv
            (GHC.Num.+ @ a_akv $dNum_akw)
            (Fibs.fibs @ a_akv $dNum_akw)
            (GHC.List.tail @ a_akv (Fibs.fibs @ a_akv $dNum_akw))))
end Rec }

如果你稍微眯一下眼睛，这本质上就是这样，

fibs :: Num a -> [a]
fibs num = fromInteger num 0
         : fromInteger num 1
         : zipWith (plus num) (fibs num) (tail (fibs num))

所以对于 GHC 来说，答案是肯定的。正如您所怀疑的，这可能会对性能产生巨大影响，因为这会破坏此定义所依赖的 fib 共享，以至于您获得指数运行时间而不是线性¹。

Prelude Fibs> :set +s
Prelude Fibs> fibs !! 30
832040
(3.78 secs, 912789096 bytes)

我们可以通过引入自己的共享来解决这个问题：

module SharedFibs where
fibs :: Num a => [a]
fibs = let f = 0 : 1 : zipWith (+) f (tail f) in f

这样要好得多。

Prelude SharedFibs> :set +s
Prelude SharedFibs> fibs !! 30
832040
(0.06 secs, 18432472 bytes)
Prelude SharedFibs> fibs !! 100000
<huge number>
(2.19 secs, 688490584 bytes)

但它仍然存在同样的问题，即 fibs 不在单独的调用之间共享。如果您想要这样，您必须在 let 或 where 中将 fibs 专门化为您想要的数字类型。

这些令人惊讶的性能是可怕的单态限制存在的部分原因。

_{¹ 忽略Integer 加法不是恒定时间这一事实。}

It depends on the implementation. In GHC, type classes are implemented using dictionaries. Let's say the Num class was defined like this (simplified for this example):

class Num a where
    fromInteger :: Integer -> a
    (+) :: a -> a -> a

Then it will be compiled as a "dictionary" data type:

data Num a = Num { fromInteger :: Integer -> a, plus :: a -> a -> a }

Anything with a Num constraint will get an extra argument for the dictionary, so for example foo x = x + 1 will become:

foo :: Num a -> a -> a
foo num x = plus num x (fromInteger num 1)

So let's see how GHC compiles fibs, shall we?

$ cat Fibs.hs
module Fibs where
fibs :: Num a => [a]
fibs = 0 : 1 : zipWith (+) fibs (tail fibs)
$ ghc -c Fibs.hs -ddump-simpl

==================== Tidy Core ====================
Rec {
Fibs.fibs [Occ=LoopBreaker]
  :: forall a_abu. GHC.Num.Num a_abu => [a_abu]
[GblId, Arity=1]
Fibs.fibs =
  \ (@ a_akv) ($dNum_akw :: GHC.Num.Num a_akv) ->
    GHC.Types.:
      @ a_akv
      (GHC.Num.fromInteger
         @ a_akv $dNum_akw (GHC.Integer.smallInteger 0))
      (GHC.Types.:
         @ a_akv
         (GHC.Num.fromInteger
            @ a_akv $dNum_akw (GHC.Integer.smallInteger 1))
         (GHC.List.zipWith
            @ a_akv
            @ a_akv
            @ a_akv
            (GHC.Num.+ @ a_akv $dNum_akw)
            (Fibs.fibs @ a_akv $dNum_akw)
            (GHC.List.tail @ a_akv (Fibs.fibs @ a_akv $dNum_akw))))
end Rec }

If you squint a little, this is essentially

fibs :: Num a -> [a]
fibs num = fromInteger num 0
         : fromInteger num 1
         : zipWith (plus num) (fibs num) (tail (fibs num))

So for GHC, the answer is yes. As you suspected, this can have drastic implications on performance, as this destroys the sharing of fibs that this definition relies on, to the point where you get exponential runtime instead of linear¹.

Prelude Fibs> :set +s
Prelude Fibs> fibs !! 30
832040
(3.78 secs, 912789096 bytes)

We can fix this problem by introducing sharing ourselves:

module SharedFibs where
fibs :: Num a => [a]
fibs = let f = 0 : 1 : zipWith (+) f (tail f) in f

This is much better.

Prelude SharedFibs> :set +s
Prelude SharedFibs> fibs !! 30
832040
(0.06 secs, 18432472 bytes)
Prelude SharedFibs> fibs !! 100000
<huge number>
(2.19 secs, 688490584 bytes)

But it still has the same problem that fibs is not shared between separate calls. If you want this, you will have to specialize fibs to your desired number type in a let or where.

These sort of performance surprises is part of the reason why the dreaded monomorphism restriction exists.

_{¹ Ignoring the fact that Integer addition is not constant time.}

回复收藏 0 原文

她比我温柔 2024-12-15 06:32:21

多态性会带来额外的性能负担（我认为这就是您要问的问题）。在托马斯对这个问题的回答< /a>，使类型成为非多态，将运行时间从 36 秒减少到 11 秒。

您的声明：

这似乎意味着在运行时，列表值 fibs 并不真正存在，而是一个每次选取 fibs 元素时重新计算列表的函数？

我不太确定你在这里的意思 - 你似乎意识到它很懒。您可能会问 Haskell 是否认为这是一个“函数声明”或“值声明” - 您可以尝试使用 Template Haskell：

> runQ [d| fib = 0 : 1 : zipWith (+) fib (tail fib) |]
[ValD (VarP fib) ...

所以它是一个值声明（ValD）。

Polymorphism can bring an additional performance burden (which I think is the question you are asking). In Thomas' answer to this question, making the type non-polymorphic reduced runtime from 36 to 11 seconds.

Your statement:

This seems to imply that at runtime, a list value fibs does not really exist, but rather a function that computes the list anew each time an element of fibs is picked?

I'm not really sure what you mean here - you seem to be aware that it's lazy. You might be asking if Haskell considers this a "function declaration" or a "value declaration" - you can try using Template Haskell:

> runQ [d| fib = 0 : 1 : zipWith (+) fib (tail fib) |]
[ValD (VarP fib) ...

So it is a value declaration (ValD).

回复收藏 0 原文

别在捏我脸啦 2024-12-15 06:32:21

首先，列表是无限的，因此在程序运行之前不可能生成整个列表。正如 MatrixFrog 已经指出的那样，fibs 是一个thunk。您可以粗略地将 thunk 想象为一个不带参数并返回值的函数。唯一的区别是，指向函数的指针随后被指向结果的指针替换，导致结果被缓存。仅当函数不依赖于任何类型类时才会发生这种情况，因为类型类通常会引入一个新参数，其中包含具有该类型类函数的字典（此过程有时称为具体化）。

很长一段时间我发布了对此的答案 codegolf.SE 问题包含 C 中 thunk 的自己实现。代码不是很好，列表内容与 thunk 本身没有很好地分离，但值得一看。

回复收藏 0 原文

此刻的回忆 2024-12-15 06:32:21

函数总是涉及 (->) 类型构造函数，因此它不是函数。这是一个价值。函数也是值，但值不是函数，这与惰性无关。函数的关键属性是您可以应用它。应用程序具有类型：

(a -> b) -> a -> b

当然，它是一个惰性值，并且在实现级别上涉及称为 thunk 的东西，但这在很大程度上与您的问题无关。 thunk 是一个实现细节。仅仅因为它是一个延迟计算的值并不能将其变成一个函数。不要将评估与执行混淆！ C 等语言中的函数与 Haskell 中的函数不同。 Haskell 使用函数的真正数学概念，这与在机器级别上执行的策略完全无关。

A function always involves the (->) type constructor, so it is not a function. It's a value. Functions are also values, but values are not functions, and that has nothing to do with laziness. The key property of a function is that you can apply it. Application has the type:

(a -> b) -> a -> b

Of course it is a lazy value and on the implementation level involves something called a thunk, but this is largely irrelevant to your question. Thunks are an implementation detail. Just because it is a lazily computed value doesn't turn it into a function. Don't confuse evaluation with execution! Functions in a language like C are not the same as function in Haskell. Haskell uses the real mathematical notion of a function, which is completely unrelated to with which strategy things are executed on the machine level.

回复收藏 0 原文

~没有更多了~