Haskell: parallel map that creates fewer sparks
I want to write a parallel map function in Haskell that's as efficient as possible. My initial attempt, which still seems to be the best so far, is simply:
import Control.Parallel.Strategies (parList, rseq, runEval)

pmap :: (a -> b) -> [a] -> [b]
pmap f = runEval . parList rseq . map f
However, I'm not seeing the work divide perfectly across the CPUs. If this is related to the number of sparks, could I write a pmap that divides the list into one segment per CPU, so that a minimal number of sparks is created? I tried the following, but the performance (and the number of sparks) is much worse:
pmap :: (a -> b) -> [a] -> [b]
pmap f xs = concat $ runEval $ parList rseq $ map (map f) (chunk xs) where
    -- (length xs `div` 4) is the size of the sublists
    chunk xs = chunk' ((length xs) `div` 4) xs
    chunk' n xs | length xs <= n = [xs]
                | otherwise = take n xs : chunk (drop n xs)
The worse performance may be correlated with the higher memory use. The original pmap does scale somewhat on 24-core systems, so it's not that I don't have enough data.
(My desktop has 4 CPUs, so I just hardcoded that.)
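One thing that may be worth checking in the chunked attempt: as written, chunk' recurses into chunk rather than into chunk' n, so the sublist size is recomputed from whatever remains of the list and shrinks at every step. That could account for the higher spark count. A minimal sketch of a chunker that keeps the size fixed follows; chunksOfSize and sizeFor are hypothetical names for illustration, not part of the original post or of any library.

```haskell
-- Hypothetical helper: split a list into sublists of a fixed size n.
-- The final sublist may be shorter than n.
chunksOfSize :: Int -> [a] -> [[a]]
chunksOfSize n xs
  | n <= 0    = [xs]   -- guard: a non-positive size would never make progress
  | null xs   = []
  | otherwise = front : chunksOfSize n rest
  where
    (front, rest) = splitAt n xs

-- Hypothetical helper: a chunk size that yields at most k sublists,
-- rounding up so that no elements are left over.
sizeFor :: Int -> [a] -> Int
sizeFor k xs = max 1 ((length xs + k - 1) `div` k)
```

With these, chunksOfSize (sizeFor 4 xs) xs produces at most four sublists, and length xs is computed once rather than at every step.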
Edit 1
Some performance data using +RTS -H512m -N -sstderr -RTS is here:
Comments (1)
The parallel package defines a number of parallel map strategies for you. parMap is a combination of parList and map:
parMap :: Strategy b -> (a -> b) -> [a] -> [b]
There is also specific support for chunking the list:
parListChunk :: Int -> Strategy a -> Strategy [a]
This divides a list into chunks and applies the strategy evalList strat to each chunk in parallel.
You should be able to use a combination of these to get any sparking behavior you desire. Or, for even more control, use the Par monad package, which lets you control the number of threads created (purely).
References: the Haddock docs for the parallel package
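As a rough illustration of the chunking support mentioned above, here is a minimal sketch of a parallel map built on parListChunk from Control.Parallel.Strategies. The name pmapChunked and the choice of chunk size are assumptions made for this example; a suitable chunk size depends on the workload and is best found by measuring.

```haskell
import Control.Parallel.Strategies (parListChunk, rseq, using)

-- Hypothetical name: a parallel map that creates one spark per chunk of
-- chunkSize elements instead of one spark per element.
-- rseq only evaluates each result to weak head normal form; for results
-- with deeper structure, rdeepseq (which needs an NFData instance) may be
-- the better fit.
pmapChunked :: Int -> (a -> b) -> [a] -> [b]
pmapChunked chunkSize f xs = map f xs `using` parListChunk chunkSize rseq
```

For a list of n elements on 4 cores, something like pmapChunked (n `div` 4) f xs gives roughly one spark per core, though somewhat smaller chunks often balance the load better. The program has to be built with -threaded and run with +RTS -N for the sparks to run in parallel.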