OCaml 和 F# 中的堆栈溢出，但 Haskell 中没有堆栈溢出

发布于 2024-08-22 11:16:36 字数 665 浏览 10 评论 0原文

我一直在比较不同语言执行以下程序的速度：对于 i 从 1 到 1000000 求和乘积 i*(sqrt i)

我的实现之一（不是唯一的）是构造一个列表 [1..1000000]，然后用特定函数折叠。

该程序在 Haskell 中运行良好且快速（即使使用 Foldl 而不是 Foldl' 时也是如此），但在 OCaml 和 F# 中会出现堆栈溢出。

这是 Haskell 代码：

test = foldl (\ a b -> a + b * (sqrt b)) 0

create 0 = []
create n = n:(create (n-1))

main = print (test (create 1000000))

这是 OCaml 代码：

let test = List.fold_left (fun a b -> a +. (float_of_int b) *. (sqrt (float_of_int b))) 0. ;;

let rec create = function
    | 0 -> []
    | n -> n::(create (n-1)) ;;

print_float (test (create 1000000));;

为什么 OCaml/F# 实现堆栈溢出？

原文

I've been comparing for fun different languages for speed in execution of the following program:
for i from 1 to 1000000 sum the product i*(sqrt i)

One of my implementations (not the only one) is constructing a list [1..1000000] and then folding with a specific function.

The program works fine and fast in Haskell (even when using foldl and not foldl') but stack overflows in OCaml and F#.

Here is the Haskell code:

test = foldl (\ a b -> a + b * (sqrt b)) 0

create 0 = []
create n = n:(create (n-1))

main = print (test (create 1000000))

And here is the OCaml one:

let test = List.fold_left (fun a b -> a +. (float_of_int b) *. (sqrt (float_of_int b))) 0. ;;

let rec create = function
    | 0 -> []
    | n -> n::(create (n-1)) ;;

print_float (test (create 1000000));;

Why does the OCaml/F# implementation stack overflows?

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

檐上三寸雪 2024-08-29 11:16:36

create 的 Haskell 代码延迟计算列表，即当 foldl 需要元素时。不需要一次性提供整个列表。

相比之下，F# 和 OCaml 严格评估 create 列表，即代码在将其传递给 fold_left 之前尝试一次性生成整个列表。

F# 中的一种可能性是在 create 函数中使用 seq 表达式：这会以与 Haskell 相同的方式延迟生成列表。（我不确定 OCaml 是否具有等效功能。）

回复收藏 0 原文

沧笙踏歌 2024-08-29 11:16:36

首先，如果您要对数字内容进行性能比较，列表并不是最佳选择。尝试像 vector 包这样的包来实现快速数组。

请注意，由于循环融合，您可以在 Haskell 中做得更好。通过将创建函数编写为枚举，编译器可以将创建步骤和折叠循环合并到不分配中间数据结构的单个循环中。像这样进行通用融合的能力是 GHC Haskell 所独有的。

我将使用向量库（基于流的循环融合）：

import qualified Data.Vector as V

test = V.foldl (\ a b -> a + b * sqrt b) 0

create n = (V.enumFromTo 1 n)

main = print (test (create 1000000))

现在，在您的代码之前，编译器无法删除所有列表，我们最终会得到一个内部循环，例如：

$wlgo :: Double# -> [Double] -> Double#

$wlgo =
  \ (ww_sww :: Double#) (w_swy :: [Double]) ->
    case w_swy of _ {
      [] -> ww_sww;
      : x_aoY xs_aoZ ->
        case x_aoY of _ { D# x1_aql ->
        $wlgo
          (+##
             ww_sww (*## x1_aql (sqrtDouble# x1_aql)))
          xs_aoZ
        }
    }

$wcreate :: Double# -> [Double]

$wcreate =
  \ (ww_swp :: Double#) ->
    case ==## ww_swp 0.0 of _ {
      False ->
        :
          @ Double
          (D# ww_swp)
          ($wcreate (-## ww_swp 1.0));
      True -> [] @ Double
    }

注意如何有两个循环：创建生成一个（惰性）列表，以及使用它的折叠。由于懒惰，该列表的成本很便宜，因此它运行得体：

$ time ./C
4.000004999999896e14
./C  0.06s user 0.00s system 98% cpu 0.058 total

然而，在融合下，我们只得到一个循环！

main_$s$wfoldlM_loop :: Double# -> Double# -> Double#

main_$s$wfoldlM_loop =
  \ (sc_sYc :: Double#) (sc1_sYd :: Double#) ->
    case <=## sc_sYc 1000000.5 of _ {
      False -> sc1_sYd;
      True ->
        main_$s$wfoldlM_loop
          (+## sc_sYc 1.0)
          (+##
             sc1_sYd (*## sc_sYc (sqrtDouble# sc_sYc)))

GHC 将我们的创建和测试步骤减少到一个循环中，不使用任何列表。寄存器中只有 2 个双精度数。
循环数量减少一半，运行速度几乎提高一倍：

$ ghc D.hs -Odph -fvia-C -optc-O3 -optc-march=native -fexcess-precision --make
$ time ./D
4.000005000001039e14
./D  0.04s user 0.00s system 95% cpu 0.038 total

这是保证纯度所提供的强大功能的一个很好的例子——编译器可以非常积极地重新排序代码。

Firstly, if you're doing performance comparisons for numeric stuff, lists aren't the best choice. Try a package like the vector package for fast arrays.

And note that you can do even better in Haskell, thanks to loop fusion. By writing the create function as an enumeration the compiler can combine the create step, and the fold loop, into a single loop that allocates no intermediate data structures. The ability to do general fusion like this is unique to GHC Haskell.

I'll use the vector library (stream-based loop fusion):

import qualified Data.Vector as V

test = V.foldl (\ a b -> a + b * sqrt b) 0

create n = (V.enumFromTo 1 n)

main = print (test (create 1000000))

Now, before, with your code, the compiler is unable to remove all lists, and we end up with an inner loop like:

$wlgo :: Double# -> [Double] -> Double#

$wlgo =
  \ (ww_sww :: Double#) (w_swy :: [Double]) ->
    case w_swy of _ {
      [] -> ww_sww;
      : x_aoY xs_aoZ ->
        case x_aoY of _ { D# x1_aql ->
        $wlgo
          (+##
             ww_sww (*## x1_aql (sqrtDouble# x1_aql)))
          xs_aoZ
        }
    }

$wcreate :: Double# -> [Double]

$wcreate =
  \ (ww_swp :: Double#) ->
    case ==## ww_swp 0.0 of _ {
      False ->
        :
          @ Double
          (D# ww_swp)
          ($wcreate (-## ww_swp 1.0));
      True -> [] @ Double
    }

Note how there are two loops: create generating a (lazy) list, and the fold consuming it. Thanks to laziness, the cost of that list is cheap, so it runs in a respectable:

$ time ./C
4.000004999999896e14
./C  0.06s user 0.00s system 98% cpu 0.058 total

Under fusion, however, we get instead a single loop only!

main_$s$wfoldlM_loop :: Double# -> Double# -> Double#

main_$s$wfoldlM_loop =
  \ (sc_sYc :: Double#) (sc1_sYd :: Double#) ->
    case <=## sc_sYc 1000000.5 of _ {
      False -> sc1_sYd;
      True ->
        main_$s$wfoldlM_loop
          (+## sc_sYc 1.0)
          (+##
             sc1_sYd (*## sc_sYc (sqrtDouble# sc_sYc)))

GHC reduced our create and test steps into a single loop with no lists used. Just 2 doubles in registers.
And with half as many loops, it runs nearly twice as fast:

$ ghc D.hs -Odph -fvia-C -optc-O3 -optc-march=native -fexcess-precision --make
$ time ./D
4.000005000001039e14
./D  0.04s user 0.00s system 95% cpu 0.038 total

This is a nice example of the power that guarantees of purity provide -- the compiler can be very aggressive at reording your code.

回复收藏 0 原文

亢潮 2024-08-29 11:16:36

修复代码以使其在 F# 中工作的另一种方法是使用也是延迟生成的序列表达式，并且以不会导致 StackOverflow 的方式生成（这在 OCaml 中不起作用，因为序列表达式是 F# 特定的）。这是代码：

let test nums = Seq.fold (fun a b -> 
  a + (float b) * (sqrt (float b))) 0.0 nums

let rec create count = seq {
  match count with
  | 0 -> do ()
  | n -> yield n
         yield! create (n-1) }

printf "%f" (test (create 1000000));;

我不确定性能，但是编译器肯定在 create 函数中进行了优化，因此迭代生成的序列应该相当快。然而，序列表达式的目标主要是可读性（因为 F# 不是纯粹的，如果您确实需要它们，它为您提供了大量的手动优化空间，而不是 Haskell 需要优化您编写的纯代码）。不管怎样，如果你有一些测试，你可以尝试一下！

Another way to fix the code so that it works in F# is to use sequence expressions that are also generated lazily and in a way that doesn't cause StackOverflow (this won't work in OCaml, because sequence expressions are F# specific). Here is the code:

let test nums = Seq.fold (fun a b -> 
  a + (float b) * (sqrt (float b))) 0.0 nums

let rec create count = seq {
  match count with
  | 0 -> do ()
  | n -> yield n
         yield! create (n-1) }

printf "%f" (test (create 1000000));;

I'm not sure about the performance, however the compiler definitely does an optimization in the create function, so that iterating over the generated sequence should be reasonably fast. However, the aim of sequence expressions is mainly readability (since F# is not pure, it gives you a lot of space for manual optimizations if you really need them as opposed to Haskell which needs to optimize the pure code you write). Anyway, if you have some tests, you can give it a try!

回复收藏 0 原文

美人迟暮 2024-08-29 11:16:36

在这种形式中，您的 create 函数不是尾递归的。您可以以不会导致堆栈溢出的尾递归形式重写它：

let rec create n l =
    match n with 
    | 0 -> l
    | n -> create (n-1) (n::l)

print_float (test (create 1000000 []));;

In this form your create function is not tail-recursive. You can rewrite it in a tail recursive form that doesn't cause a stack overflow:

let rec create n l =
    match n with 
    | 0 -> l
    | n -> create (n-1) (n::l)

print_float (test (create 1000000 []));;

回复收藏 0 原文

相对绾红妆 2024-08-29 11:16:36

该程序在 Haskell 中运行良好且快速（即使使用 Foldl 而不是 Foldl' 时也是如此），但在 OCaml 和 F# 中会出现堆栈溢出。

您的 Haskell 使用惰性列表，但您的 OCaml/F# 使用严格列表，因此您的程序是无可比拟的。

FWIW，在 F# 中使用按需序列的解决方案是：

Seq.sumBy (fun i -> let i = float i in i * sqrt i) (seq {1..1000000})

The program works fine and fast in Haskell (even when using foldl and not foldl') but stack overflows in OCaml and F#.

Your Haskell is using lazy lists but your OCaml/F# is using strict lists, so your programs are incomparable.

FWIW, a solution using on-demand sequences in F# is:

Seq.sumBy (fun i -> let i = float i in i * sqrt i) (seq {1..1000000})

回复收藏 0 原文

~没有更多了~

关于作者

江湖正好

暂无简介

文章

25 人气

关注发私信

友情链接

文江博客

OCaml 和 F# 中的堆栈溢出，但 Haskell 中没有堆栈溢出

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

评论（5）

关于作者

相关话题

热门标签

推荐作者

峩卟喜欢

一向肩并

潜伏

dongyinghao

百变从容

沧笙踏歌

友情链接

OCaml 和 F# 中的堆栈溢出，但 Haskell 中没有堆栈溢出

如果你对这篇内容有疑问，欢迎到本站社区发帖提问 参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

评论（5）

关于作者

相关话题

热门标签

推荐作者

峩卟喜欢

一向肩并

潜伏

dongyinghao

百变从容

沧笙踏歌

友情链接

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。