使用 monad 的并行策略

发布于 2024-10-10 09:30:19 字数 181 浏览 0 评论 0原文

我经常看到与纯计算相关的 Haskell 并行策略的用法和解释（例如 fib）。但是，我不经常看到它与一元结构一起使用：当应用于 ST 或 IO 时，是否有对 par 和相关函数的效果的合理解释？这样的用法会带来任何加速吗？

原文

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

乖乖哒 2024-10-17 09:30:19

IO monad 中的并行性更准确地称为“并发”，并且由 forkIO 和 Control.Concurrent 模块中的朋友支持。

并行化 ST monad 的困难在于 ST 必然是单线程的——这就是它的目的。 ST monad 有一个惰性变体，Control.Monad.ST.Lazy，原则上它可以支持并行计算，但我不知道有人尝试这样做。

有一个用于并行评估的新 monad，名为 Eval，可以在最新版本的并行包。我建议将 Eval monad 与 rpar 和 rseq 一起使用，而不是 par 和 pseq如今，因为它可以带来更健壮和可读的代码。例如，通常的fib示例可以写成

fib n = if n < 2 then 1 else 
        runEval $ do
           x <- rpar (fib (n-1))
           y <- rseq (fib (n-2))
           return (x+y)

Parallelism in the IO monad is more correctly called "Concurrency", and is supported by forkIO and friends in the Control.Concurrent module.

The difficulty with parallelising the ST monad is that ST is necessarily single-threaded - that's its purpose. There is a lazy variant of the ST monad, Control.Monad.ST.Lazy, which in principle could support parallel evaluation, but I'm not aware of anyone having tried to do this.

There's a new monad for parallel evaluation called Eval, which can be found in recent versions of the parallel package. I recommend using the Eval monad with rpar and rseq instead of par and pseq these days, because it leads to more robust and readable code. For example, the usual fib example can be written

fib n = if n < 2 then 1 else 
        runEval $ do
           x <- rpar (fib (n-1))
           y <- rseq (fib (n-2))
           return (x+y)

回复收藏 0 原文

半边脸i 2024-10-17 09:30:19

在某些情况下这是有意义的，但一般来说您不应该这样做。检查以下内容：

doPar =
  let a = unsafePerformIO $ someIOCalc 1
      b = unsafePerformIO $ someIOCalc 2
  in a `par` b `pseq` a+b

在 doPar 中，引发 a 的计算，然后主线程计算 b。但是，主线程完成 b 的计算后，可能也会开始计算 a 。现在您有两个线程评估 a，这意味着某些 IO 操作将执行两次（或可能更多）。但是，如果一个线程完成对 a 的评估，另一个线程就会放弃迄今为止所做的事情。为了确保这一点的安全，您需要满足以下几点：

多次执行 IO 操作是安全的。
仅执行部分 IO 操作是安全的（例如，无需清理）。IO
操作不存在任何竞争条件。如果一个线程在计算 a 时改变了一些数据，那么另一个也在处理 a 的线程会表现得明智吗？可能不会。
任何外部调用都是可重入的（当然，一般情况下您需要这个来实现并发）

如果您的 someIOCalc 看起来像这样，

someIOCalc n = do
  prelaunchMissiles
  threadDelay n
  launchMissiles

那么将其与 par 和 一起使用绝对不安全>unsafePerformIO。

现在，这值得吗？或许。 Spark 很便宜，甚至比线程更便宜，所以理论上它应该是性能增益。实际上，也许没有那么多。 Roman Leschinsky 有一篇关于此的精彩博客文章。

就我个人而言，我发现 forkIO 的推理要简单得多。

There are some situations where this makes sense, but in general you shouldn't do it. Examine the following:

doPar =
  let a = unsafePerformIO $ someIOCalc 1
      b = unsafePerformIO $ someIOCalc 2
  in a `par` b `pseq` a+b

in doPar, a calculation for a is sparked, then the main thread evaluates b. But, it's possible that after the main thread finishes the calculation of b it will begin to evaluate a as well. Now you have two threads evaluating a, meaning that some of the IO actions will be performed twice (or possibly more). But if one thread finishes evaluating a, the other will just drop what it's done so far. In order for this to be safe, you need a few things to be true:

It's safe for the IO actions to be performed multiple times.
It's safe for only some of the IO actions to be performed (e.g. there's no cleanup)
The IO actions are free of any race conditions. If one thread mutates some data when evaluating a, will the other thread also working on a behave sensibly? Probably not.
Any foreign calls are re-entrant (you need this for concurrency in general of course)

If your someIOCalc looks like this

someIOCalc n = do
  prelaunchMissiles
  threadDelay n
  launchMissiles

it's absolutely not safe to use this with par and unsafePerformIO.

Now, is it ever worth it? Maybe. Sparks are cheap, even cheaper than threads, so in theory it should be a performance gain. In practice, perhaps not so much. Roman Leschinsky has a nice blog post about this.

Personally, I've found it much simpler to reason about forkIO.

回复收藏 0 原文

~没有更多了~