为什么foldl'使用大量的RAM与复杂的数据结构?

发布于 2025-01-17 08:32:10 字数 484 浏览 2 评论 0原文

懒惰的折叠使用了很多RAM。在data.list中,foldl'提供了使用严格评估的左折。例如,以下计算1000万个零的总和,而RAM使用率几乎没有增加。

sum0 = foldl' (+) 0 (replicate 10000000 0)

但是,这似乎不存在复杂的数据结构。例如,如果我们为成对的数字定义添加并计算零对的总和,则RAM使用情况大大增加:

(x1,y1) <+> (x2,y2) = (x1 + x2,y1 + y2)
sum00 = foldl' (<+>) (0,0) (replicate 10000000 (0,0))

为什么会发生这种情况?有没有办法减少RAM使用情况?

Lazy fold uses a lot of RAM. In Data.List, foldl' provides a left fold that uses strict evaluation. For example, the following computes the sum of 10 million zeros with little increase in RAM usage.

sum0 = foldl' (+) 0 (replicate 10000000 0)

However, this does not seem to hold with complex data structures. For example, if we define addition for pairs of numbers and compute the sum of zero pairs, there is significant increase in RAM usage:

(x1,y1) <+> (x2,y2) = (x1 + x2,y1 + y2)
sum00 = foldl' (<+>) (0,0) (replicate 10000000 (0,0))

Why does that happen? Is there a way to decrease RAM usage?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

得不到的就毁灭 2025-01-24 08:32:10

foldl'仅评估中间状态到弱的正态形式,即以第一个构造函数为单位。这是最通用的函数 can 的功能,以及所谓的“严格”通常可以使用的功能。评估(x1,y1)&lt;+&gt; (x2,y2)直到看起来像构造函数给出(x1 + x2,y1 + y2),其中零件仍未被估算(它们已被(,))。通过迭代,foldl'严格将状态保持在(_,_)而不是(_,_,_)&lt;+&gt; (_,_)&lt;+&gt; ...,但是_ s成长为形式的巨大未评估条款_ + _ + _ + _ + ...

修改&lt;+&gt;在公开构造函数之前评估添加剂。

(x1, y1) <+> (x2, y2) = x `seq` y `seq` (x, y)
  where x = x1 + x2; y = y1 + y2
-- or
(x1, y1) <+> (x2, y2) = ((,) $! x1 + y1) $! x2 + y2
-- or (with deepseq package)
(x1, y1) <+> (x2, y2) = force (x1 + x2, y1 + y2)

-- x `seq` y = y, but only if x reaches WHNF
-- usually, evaluating x `seq` y to WHNF evaluates x (to WHNF) before it returns the result of evaluating y to WHNF
-- though that's not the official definition of `seq`, since Haskell nominally doesn't have an evaluation strategy
-- (and GHC's actual `seq` may do something different if GHC is feeling smart)

foldl' only evaluates the intermediate state to weak head normal form—i.e. up to the first constructor. That's the most a generic function can do, and what functions that are called "strict" generally do. Evaluating (x1, y1) <+> (x2, y2) until it looks like a constructor gives (x1 + x2, y1 + y2), where the parts are still unevaluated (they have been "protected" by the (,)). Through the iteration, foldl' being strict keeps the state in the form (_, _) instead of (_, _) <+> (_, _) <+> ..., but the _s grow into huge unevaluated terms of form _ + _ + _ + ....

Modify <+> to evaluate the additions before it exposes the constructor.

(x1, y1) <+> (x2, y2) = x `seq` y `seq` (x, y)
  where x = x1 + x2; y = y1 + y2
-- or
(x1, y1) <+> (x2, y2) = ((,) $! x1 + y1) $! x2 + y2
-- or (with deepseq package)
(x1, y1) <+> (x2, y2) = force (x1 + x2, y1 + y2)

-- x `seq` y = y, but only if x reaches WHNF
-- usually, evaluating x `seq` y to WHNF evaluates x (to WHNF) before it returns the result of evaluating y to WHNF
-- though that's not the official definition of `seq`, since Haskell nominally doesn't have an evaluation strategy
-- (and GHC's actual `seq` may do something different if GHC is feeling smart)
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文