Haskell: a parallel map that creates fewer sparks

Posted on 2024-11-06 18:24:21

I want to write a parallel map function in Haskell that's as efficient as possible. My initial attempt, which still seems to be the best so far, is simply:

pmap :: (a -> b) -> [a] -> [b]
pmap f = runEval . parList rseq . map f

I'm not seeing perfect CPU division, however. If this is related to the number of sparks, could I write a pmap that divides the list into as many segments as there are CPUs, so that a minimal number of sparks is created? I tried the following, but the performance (and number of sparks) is much worse:

pmap :: (a -> b) -> [a] -> [b]
pmap f xs = concat $ runEval $ parList rseq $ map (map f) (chunk xs) where
    -- the (len / 4) argument represents the size of the sublists
    chunk xs = chunk' ((length xs) `div` 4) xs
    chunk' n xs | length xs <= n = [xs]
                | otherwise = take n xs : chunk (drop n xs)

The worse performance may be correlated with the higher memory use. The original pmap does scale somewhat on 24-core systems, so it's not that I don't have enough data. (The number of CPUs on my desktop is 4, so I just hardcoded that.)

Edit 1

Some performance data using +RTS -H512m -N -sstderr -RTS is here:

远山浅 2024-11-13 18:24:21

The parallel package defines a number of parallel map strategies for you:

parMap :: Strategy b -> (a -> b) -> [a] -> [b]

This is a combination of parList and map. There is also specific support for chunking the list:

parListChunk :: Int -> Strategy a -> Strategy [a]

Divides a list into chunks, and applies the strategy evalList strat to each chunk in parallel.
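As a rough illustration of how parListChunk could replace the hand-written chunking from the question, here is a minimal sketch. The names pmapChunked and pmapPerCore are hypothetical, and reading the number of cores from numCapabilities (GHC.Conc) instead of hardcoding 4 is an assumption of this sketch, not something the question does. Note also that in the question's version, as posted, chunk' recurses into chunk rather than into itself, so the chunk size is recomputed from the shrinking remainder at every step; that alone yields many more than four chunks and may account for some of the extra sparks.

```haskell
import Control.Parallel.Strategies (parListChunk, rseq, using)
import GHC.Conc (numCapabilities)

-- | Parallel map that creates one spark per chunk of `chunkSize`
-- elements instead of one spark per element.
-- (Hypothetical name; rseq only evaluates each result to WHNF.)
pmapChunked :: Int -> (a -> b) -> [a] -> [b]
pmapChunked chunkSize f xs =
  map f xs `using` parListChunk (max 1 chunkSize) rseq

-- | Hypothetical helper: choose the chunk size so that there is
-- roughly one chunk per capability (the value given to +RTS -N).
-- Calling `length` forces the spine of the whole input list.
pmapPerCore :: (a -> b) -> [a] -> [b]
pmapPerCore f xs = pmapChunked size f xs
  where
    n    = length xs
    size = (n + numCapabilities - 1) `div` numCapabilities -- ceiling division
```

For example, pmapPerCore expensive inputs (with expensive and inputs standing in for your own function and data) would be a drop-in replacement for the original pmap. If the elements need more than WHNF evaluation, rdeepseq can be substituted for rseq, and using a few chunks per core rather than exactly one may balance the load better when elements vary in cost.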

You should be able to use a combination of these to get any sparking behavior you desire. Or, for even more control, look at the Par monad package, which lets you control the number of threads created (purely).
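For the Par monad route, a minimal sketch might look like the following. It assumes the package meant here is monad-par (module Control.Monad.Par); chunksOf and pmapPar are hypothetical names introduced only for this example. Each chunk is forked explicitly with spawnP, so the number of parallel tasks is exactly the number of chunks.

```haskell
import Control.DeepSeq (NFData)
import Control.Monad.Par (get, runPar, spawnP)

-- | Hypothetical helper: split a list into pieces of at most n elements
-- (n is clamped to at least 1 so the recursion always makes progress).
chunksOf :: Int -> [a] -> [[a]]
chunksOf n xs
  | null xs   = []
  | otherwise = let (h, t) = splitAt (max 1 n) xs in h : chunksOf n t

-- | Parallel map in the Par monad: one task per chunk.
-- spawnP fully evaluates each chunk's results, hence the NFData constraint.
pmapPar :: NFData b => Int -> (a -> b) -> [a] -> [b]
pmapPar chunkSize f xs = concat (runPar (do
  ivars <- mapM (spawnP . map f) (chunksOf chunkSize xs) -- fork one task per chunk
  mapM get ivars))                                       -- collect results in order
```

Unlike the Strategies version, the results here are always evaluated to normal form, which avoids accidentally sparking work that is still lazy but requires an NFData instance for the result type.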

References: The haddock docs for the parallel package
