使用Parallel.Foreach(多个线程)从列表中获取总和到值的元素数组
我正在使用以下代码从列表中获取总和到值的元素数组。但是,它仅使用1个CPU。我的PC有64个核心,因此我想使用100%的CPU加快流程,因此您可以帮助我更新代码以使用 Parallel.Foreach.foreach
(多个线程)吗?
using System;
using System.Collections.Generic;
using System.Linq;
IEnumerable<List<int>> subset_sum(IEnumerable<int> numbers, int target,
IEnumerable<int> partial = null, int partial_sum = 0)
{
partial ??= Enumerable.Empty<int>();
if (partial_sum == target) yield return partial.ToList();
if (partial_sum >= target) yield break;
foreach (var (n, i) in numbers.Select((n, i) => (n, i)))
{
var remaining = numbers.Skip(i + 1);
foreach (var subset in subset_sum(
remaining, target, partial.Append(n), partial_sum + n))
yield return subset;
}
}
var numbers = new List<int> { 3, 9, 8, 4, 5, 7, 10 };
var target = 15;
foreach (var subset in subset_sum(numbers, target))
Console.WriteLine($"[{string.Join(", ", subset)}] = {subset.Sum()}");
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
这篇文章不是关于如何并行化子集找到算法的答案。它只是表明可以更快的单线程实现。以下实现不是递归的。它使用
stack&lt; int
作为查找源序列的所有组合的机制(待处理
stack)。它的速度比我的PC中的递归实现快7-8倍。This post is not an answer on how to parallelize the subset-finding algorithm. It just shows that a faster single-thread implementation is possible. The implementation below is not recursive. It uses a
Stack<int>
as a mechanism for finding all the combinations of the source sequence (thepending
stack). It is about 7-8 times faster than the recursive implementation in my PC.并行化递归算法并不容易。当工作的功能可以递归创造更多的工作时,如何划分工作量并不明显。在这种特殊情况下,可以通过将
数字
分为两个部分,并将两个部分之一作为分区进行分配。然后,对于每个分区,在另一部分中搜索组合的总和等于target
减去分区的总和。以下是该想法的实现。 plinq 并行化:此实现并不比原始的单线程方法快得多,因为递归算法相当涉及内存且效率低下。比较 执行前后,记忆分配是巨大的:我的PC中每秒约700 MB。因此,而不是瓶颈是CPU,而是内存总线的带宽。 CPU内核主要在等待数据存储或从主内存中获取,而不是计算。至少这是我的理论。要进行比较,我在我的其他答案,产生合理的性能改进(我的四核PC上约为2.4倍)。
Parallelizing recursive algorithms is not easy. It's not obvious how to partition the workload, when the function that does the work can create more work recursively. In this particular case it is possible to partition the work by splitting the
numbers
into two parts, and using one of the two parts as the partitions. Then for each partition search for combinations in the other part that have a sum equal to thetarget
minus the sum of the partition. Below is an implementation of this idea. The PLINQ library is used for the parallelization:This implementation is not much faster than the original single-thread method, because the recursive algorithm is quite memory-hungry and inefficient. Comparing the value of the
GC.GetTotalAllocatedBytes(true)
before and after the execution reveals that the memory allocation is massive: around 700 MB per second in my PC. So instead of the bottleneck being the CPU, it's the bandwidth of the memory bus. The CPU cores are mostly waiting for data to be stored or be fetched from the main memory, instead of calculating. At least this is my theory. For comparison parallelizing the non-recursive algorithm that I've posted in my other answer, yields reasonable performance improvements (around 2.4x on my quad core PC).