How to make serialization run in parallel
Hi all,
If I have a tree structure and want to serialize the tree and its sub-nodes, how can I serialize the nodes in parallel?
If I assign each node to an independent task, the output data ends up out of order.
Is there a pattern for concurrent serialization?
Edit: What if the structure is not a tree but a DAG? How should that be handled, and how can a DAG be serialized concurrently?
Comments (3)
This is an ideal problem for recursive parallelism, also called Fork/Join parallelism.
At each level of the tree, spawn a task to serialize each of the nodes to a temporary buffer, then wait for those tasks and join the buffers in child order (for a binary tree: the left buffer, then the right).
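A minimal Python sketch of the fork/join idea, offered as an illustration rather than the answerer's original code. The `Node` class, the `serialize` function, the parenthesized output format, and the `depth` cutoff are all assumptions made for this example. The cutoff limits how many pool tasks block waiting on sub-tasks, which could otherwise starve a bounded pool.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    # Hypothetical binary tree node used only for this sketch.
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def serialize(node, pool, depth=2):
    """Return a pre-order serialization of `node` as bytes."""
    if node is None:
        return b"#"  # marker for an empty subtree
    if depth > 0:
        # Fork: the left subtree is serialized by a pool worker
        # while this thread handles the right subtree.
        left_future = pool.submit(serialize, node.left, pool, depth - 1)
        right = serialize(node.right, pool, depth - 1)
        left = left_future.result()  # Join: wait for the forked task.
    else:
        # Below the cutoff, fall back to plain sequential recursion.
        left = serialize(node.left, pool, 0)
        right = serialize(node.right, pool, 0)
    # Buffers are joined in a fixed order, so the output is deterministic
    # no matter which task finishes first.
    return b"(" + str(node.value).encode() + b" " + left + b" " + right + b")"


if __name__ == "__main__":
    tree = Node(1, Node(2, Node(4), Node(5)), Node(3))
    with ThreadPoolExecutor(max_workers=4) as pool:
        print(serialize(tree, pool))
```

Because each task writes only to its own buffer and the parent concatenates them in a fixed order, no locking is needed. In CPython, threads show the structure more than they speed up CPU-bound work; a process pool or a language with real parallel threads would be needed for that.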
Obviously, you'll need to handle empty trees and leaf nodes, but this should give you the idea.
EDIT: To handle a DAG, pass the futures associated with each node's dependencies into its async call, so that each task explicitly waits for the tasks it depends on to finish.
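The future-passing idea for a DAG might look roughly like this in Python. The `DAG` dictionary, the `serialize_node` and `serialize_dag` names, and the `name:deps` record format are invented for the example. Tasks are submitted in topological order (dependencies first), which this sketch relies on so that a waiting task never sits in front of the task it waits for.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical DAG: each node maps to the nodes it depends on.
# Keys are listed in topological order (dependencies first).
DAG = {"d": [], "b": ["d"], "c": ["d"], "a": ["b", "c"]}


def serialize_node(name, dep_futures):
    # Explicitly wait for every dependency before serializing this node.
    dep_names = [f.result()[0] for f in dep_futures]
    # Dependencies are referenced by name rather than embedded,
    # so a shared node such as "d" is written only once.
    return name, f"{name}:{','.join(dep_names)}"


def serialize_dag(dag, max_workers=4):
    futures = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for name, deps in dag.items():
            # Hand each task the futures of the nodes it depends on.
            futures[name] = pool.submit(
                serialize_node, name, [futures[d] for d in deps]
            )
        # Collect records in submission order, not completion order.
        return [f.result()[1] for f in futures.values()]


if __name__ == "__main__":
    print(serialize_dag(DAG))
```

Collecting the results in submission order keeps the output stable even though `b` and `c` may finish in either order.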
From your very brief description it's not clear what constraints you have, so I'll answer generally.
If you want to process the nodes of a tree in parallel while keeping the results of the processing in order, you could do something like this.
Here the ordering is preserved by carrying the number assigned in step #1 through the whole chain.
Step #2 could be a thread-pool-backed class that you simply add items (number-node pairs) to.
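One possible reading of this scheme, sketched in Python. The three steps in the comments are an interpretation, not the answerer's exact list: number the nodes in traversal order, process the number-node pairs in a thread pool, then reorder the results by number. The tuple-based tree and the `<value>` output format are placeholders.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def walk(tree):
    """Yield node values in pre-order; a tree is (value, [children])."""
    value, children = tree
    yield value
    for child in children:
        yield from walk(child)


def serialize_value(numbered):
    # The number travels with the node through the whole chain.
    number, value = numbered
    return number, f"<{value}>"


def serialize_ordered(tree, max_workers=4):
    # Step 1 (assumed): number the nodes in traversal order.
    numbered = list(enumerate(walk(tree)))
    # Step 2 (assumed): process the pairs in parallel; completion order is arbitrary.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(serialize_value, item) for item in numbered]
        results = [f.result() for f in as_completed(futures)]
    # Step 3 (assumed): restore the original order using the numbers.
    results.sort(key=lambda pair: pair[0])
    return "".join(text for _, text in results)


if __name__ == "__main__":
    tree = ("a", [("b", [("d", [])]), ("c", [])])
    print(serialize_ordered(tree))
```

The final sort is what makes the arbitrary completion order harmless; a streaming writer could instead hold back results until the next expected number arrives.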
Running parallel jobs on a DAG is what most build tools do to compile your software. In a single-threaded or single-process environment, the classic solution is to order the tasks with a topological sort and then process the jobs in that order.
In a multi-threaded or multi-process environment, however, you must make sure that a node's task is not processed before its dependencies have finished. This means you have to maintain a queue with blocking workers. You also have to maintain the invariant that every node in the queue is ready, i.e. all of its dependencies have finished processing.
One possible implementation is to maintain a dependency counter for each node: when one of its dependencies completes, decrement the counter, and when the counter reaches 0, insert the node into the queue.
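A rough Python sketch of the counter-and-queue approach. The `run_dag` function, its `work` callback, and the `None` sentinel used to stop the workers are assumptions for illustration. It also assumes the graph is acyclic and that every node appears as a key of `deps`.

```python
import queue
import threading


def run_dag(deps, work, num_workers=4):
    """Call work(node) for every node once all of its dependencies are done.

    deps maps each node to the list of nodes it depends on.
    Returns the nodes in the order they completed.
    """
    if not deps:
        return []
    # Dependency counter per node, plus the reverse edges.
    remaining = {node: len(ds) for node, ds in deps.items()}
    dependents = {node: [] for node in deps}
    for node, ds in deps.items():
        for d in ds:
            dependents[d].append(node)

    ready = queue.Queue()  # invariant: only ready nodes are queued
    for node, count in remaining.items():
        if count == 0:
            ready.put(node)

    lock = threading.Lock()
    done = []

    def worker():
        while True:
            node = ready.get()  # blocks until some node is ready
            if node is None:    # sentinel: all work is finished
                return
            work(node)
            with lock:
                done.append(node)
                for m in dependents[node]:
                    remaining[m] -= 1
                    if remaining[m] == 0:
                        ready.put(m)
                if len(done) == len(deps):
                    for _ in range(num_workers):
                        ready.put(None)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return done


if __name__ == "__main__":
    deps = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
    print(run_dag(deps, lambda node: None))
```

The counters are only touched under the lock, so two dependencies finishing at the same moment cannot both see a stale count or enqueue the same node twice.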