Haskell: a parallel map that creates fewer sparks

Posted on 2024-11-06 18:24:21

I want to write a parallel map function in Haskell that's as efficient as possible. My initial attempt, which still seems to be the best so far, is simply:

import Control.Parallel.Strategies (parList, rseq, runEval)

pmap :: (a -> b) -> [a] -> [b]
pmap f = runEval . parList rseq . map f

I'm not seeing the work divided perfectly across the CPUs, however. If this is related to the number of sparks, could I write a pmap that divides the list into as many segments as there are CPUs, so that a minimal number of sparks is created? I tried the following, but the performance (and the number of sparks) is much worse:

pmap :: (a -> b) -> [a] -> [b]
pmap f xs = concat $ runEval $ parList rseq $ map (map f) (chunk xs) where
    -- the (len / 4) argument represents the size of the sublists
    chunk xs = chunk' ((length xs) `div` 4) xs
    chunk' n xs | length xs <= n = [xs]
                | otherwise = take n xs : chunk (drop n xs)

The worse performance may be correlated with the higher memory use. The original pmap does scale somewhat on 24-core systems, so it's not that I don't have enough data. (My desktop has 4 CPUs, so I just hardcoded that.)

Edit 1

Some performance data using +RTS -H512m -N -sstderr -RTS is here:

Comments (1)

远山浅 2024-11-13 18:24:21

The parallel package defines a number of parallel map strategies for you:

parMap :: Strategy b -> (a -> b) -> [a] -> [b]

A combination of parList and map, and specific support for chunking the list:

parListChunk :: Int -> Strategy a -> Strategy [a]

Divides a list into chunks, and applies the strategy evalList strat to each chunk in parallel.
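As a minimal sketch of how these two combinators might be used for a pmap: the names pmapEach and pmapChunked and the chunkSize parameter are illustrative and do not come from the parallel package. Note that rseq only evaluates each result to weak head normal form, so results with deeper structure may need a stronger strategy such as rdeepseq.

```haskell
import Control.Parallel.Strategies (parListChunk, parMap, rseq, using)

-- One spark per list element, like the first pmap in the question.
-- (pmapEach is a hypothetical name.)
pmapEach :: (a -> b) -> [a] -> [b]
pmapEach = parMap rseq

-- One spark per chunk of chunkSize elements; each chunk is evaluated
-- sequentially with `evalList rseq`. chunkSize is chosen by the caller.
-- (pmapChunked is a hypothetical name.)
pmapChunked :: Int -> (a -> b) -> [a] -> [b]
pmapChunked chunkSize f xs = map f xs `using` parListChunk chunkSize rseq
```

A larger chunkSize means fewer sparks but coarser load balancing, so the best value probably depends on how expensive f is per element.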

You should be able to use a combination of these to get any sparking behavior you desire. Or, for even more control, use the Par monad package, which lets you control the number of threads created (purely).
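For the "one segment per CPU" idea from the question, here is a hedged sketch of a manual version. Two details of the question's attempt seem relevant: chunk' recurses into chunk rather than into itself, so the chunk size is recomputed from the remaining list at every step and many more chunks than intended are produced; and parList rseq over a list of sublists only forces each sublist to weak head normal form, so most of the work is not done inside the sparks. The names chunksOf' and pmapPerCore are hypothetical, and numCapabilities reflects the value given to +RTS -N at startup.

```haskell
import Control.Parallel.Strategies (evalList, parList, rseq, using)
import GHC.Conc (numCapabilities)

-- Split a list into pieces of at most n elements. n is clamped to at
-- least 1 so that short lists cannot yield an endless run of empty chunks.
chunksOf' :: Int -> [a] -> [[a]]
chunksOf' n = go
  where
    size = max 1 n
    go [] = []
    go ys = let (h, t) = splitAt size ys in h : go t

-- At most one chunk, and hence one spark, per capability.
-- `evalList rseq` forces every element of a chunk inside its spark.
pmapPerCore :: (a -> b) -> [a] -> [b]
pmapPerCore f xs =
  concat (map (map f) chunks `using` parList (evalList rseq))
  where
    caps   = max 1 numCapabilities
    size   = (length xs + caps - 1) `div` caps  -- ceiling division
    chunks = chunksOf' size xs
```

With exactly one chunk per core there is no slack for load balancing, so if the cost of f varies a lot between elements, several smaller chunks per core may work better. Calling length also forces the whole spine of the input list up front.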


References: The haddock docs for the parallel package
