F# 树上并行化函数的选项
在我的项目中,我有一个如下表示的数据结构:
type 'a Tree =
| Leaf of 'a
| Node of 'a Tree array
由于遍历大树的成本,我必须在此数据结构上并行化以下一些函数:
- 存在映射
- 折叠
- (存在满足 谓词)
- 减少/转换(可选)
由于我正在处理的问题的性质,每个节点的分支数量各不相同,并且叶级别的工作负载非常小。 第一个问题是我应该考虑在树上并行执行什么选项。但是,我尝试在每个节点上使用 Array.Parallel 模块中的函数,因为并行度太大,并行版本甚至比顺序版本慢。如果有必要,我可以将数组表示更改为 List
或 PSeq
。
第二个问题是如何控制这些函数的并行度。我正在考虑通过树的深度、每个节点的分支数量、叶级的工作负载复杂性以及叶级的叶数进行控制。然而,将它们组合在一起似乎是复杂且不可预测的。
In my project, I have a data structure represented as follows:
type 'a Tree =
| Leaf of 'a
| Node of 'a Tree array
Due to the cost of traversing through large trees, I have to parallelize some following functions on this data structure:
- map
- fold
- exists (exist a node satisfying a
predicate) - reduce / transform (optional)
Because of nature of the problem I'm working on, number of branches at each node is varied and workloads at the leaf level are quite small. The first question is what options should I consider for parallel execution on tree. I'm trying to use functions from Array.Parallel
module on every node, however, because overheads of parallelism is too big, the parallel version is even slower than the sequential version. I may change array representation to List
or PSeq
if it is necessary.
The second question is how to control degree of parallelism on those functions. I'm thinking about controlling by depth of tree, number of branches at each node, workload complexity at the leaf level and number of leaves on tree, however, combining them together seems to be complex and unpredictable.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
将遍历与其他处理分开怎么样?也许创建一个工作队列(
MailboxProcessor
是一个很好的起点),并在遍历树时将其他工作排入队列以进行后台处理。它不能解决并行遍历问题(这似乎很难适应所有情况),但由于附加处理被降级到后台,它应该会很快。您可以尝试后台工作人员的数量,直到找到良好的并行度。这一切都假设每个节点要完成的工作量并不小。编辑
这是一些代码。我确信它可以改进。我必须很快地把它敲出来。但这显示了基本概念。它只有一个“后台工作者”,即
MailboxProcessor
。我将保留更新它以使用多个工作人员的想象力。How about separating traversal from any other processing? Perhaps create a work queue (
MailboxProcessor
is a good starting point) and, as the tree is traversed, enqueue additional work for background processing. It doesn't solve the parallel traversal problem (which seems tricky to get right for all cases) but with additional processing relegated to the background, it should go pretty quickly. You can experiment with the number of background workers until you find a good degree of parallelism. This all assumes the amount of work to be done for each node is non-trivial.EDIT
Here's some code. I'm sure it can be improved. I had to hammer it out pretty quickly. But this shows the basic concept. It only has one "background worker," i.e., the
MailboxProcessor
. I'll leave updating it to use multiple workers to the imagination.Array.Parallel
使用System.Threading.Parallel.For
。当您调用该函数时,它会尝试为给定任务找到最佳计划。然而,对于典型的递归树算法,这意味着对 Parallel.For 的大量调用,并且您最终可能会得到太多的线程。 (除非Parallel.For
针对我不知道的用例进行了优化。)所以我认为如果每个节点的工作负载不是太小的话,丹尼尔的建议是一个好主意。另一种想法是引入相对于树的剩余深度的阈值,如 Stephen Toub 在 此博客条目。
Array.Parallel
usesSystem.Threading.Parallel.For
. When you call that function, it tries to find an optimal schedule for the given task. However, with a typical recursive tree algorithm, this means a lot of calls toParallel.For
and you probably end up with far too many threads. (UnlessParallel.For
is optimized for that use case which I don't know.)So I think Daniel's suggestion is a good idea if the workload per node is not too small. An alternative idea is to introduce a threshold with respect to the remaining depth of a tree as described by Stephen Toub at the end of this blog entry.