Consider the bisection algorithm for finding a square root. Every step depends on the previous one, so in my opinion it is not possible to parallelize it. Am I wrong?
Consider also similar algorithms, such as binary search.
Edit
My problem is not bisection itself, but it is very similar. I have a monotonic function f(mu) and I need to find the mu where f(mu) < alpha. One core needs 2 minutes to compute f(mu), and I need very high precision. We have a farm of ~100 cores. My first attempt was to use only 1 core and scan the values of f with a dynamic step, depending on how close I am to alpha. Now I want to use the whole farm, but my only idea is to compute 100 values of f at equally spaced points.
Comments (3)
It depends on what you mean by parallelize, and at what granularity. For example, you could use instruction-level parallelism (e.g. SIMD) to find the square roots of a whole set of input values at once.
Binary search is trickier, because both the control flow and the number of iterations are data-dependent. You could still conceivably perform a number of binary searches in parallel, so long as every search is allowed to run for the maximum number of iterations (log2 N).
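As a rough sketch of that last idea, the function below runs several binary searches in lockstep for a fixed number of steps, so every search does the same amount of work regardless of its data. Plain Python lists stand in for SIMD lanes here, and the name `batch_lower_bound` is made up for this illustration; it only shows the control-flow shape, not a real vectorized implementation.

```python
import math

def batch_lower_bound(sorted_values, keys):
    """For each key, return the first index i with sorted_values[i] >= key.

    All searches advance together for a fixed number of steps,
    ceil(log2(N + 1)), which is enough for every search to finish.
    """
    n = len(sorted_values)
    steps = max(1, math.ceil(math.log2(n + 1)))
    lo = [0] * len(keys)   # one [lo, hi) interval per "lane"
    hi = [n] * len(keys)
    for _ in range(steps):
        for lane, key in enumerate(keys):
            if lo[lane] < hi[lane]:          # finished lanes simply idle
                mid = (lo[lane] + hi[lane]) // 2
                if sorted_values[mid] < key:
                    lo[lane] = mid + 1
                else:
                    hi[lane] = mid
    return lo

data = [1, 3, 5, 7, 9, 11]
positions = batch_lower_bound(data, [0, 1, 4, 11, 12])
```

The inner loop over lanes is the part that SIMD hardware or separate cores would execute simultaneously; the outer loop over steps stays sequential.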
Even if these algorithms could be parallelized (and I'm not sure they can), there is very little point in doing so.
Generally speaking, there is very little point in attempting to parallelize algorithms that already have sub-linear time bounds (that is, o(n), such as the O(log n) of binary search). These algorithms are already so fast that extra hardware will have very little impact.
Furthermore, it is not true in general that an algorithm with data dependencies cannot be parallelized. In some cases, for example, it is possible to set up a pipeline in which different functional units operate in parallel and feed data sequentially from one to the next. Image processing algorithms, in particular, are frequently amenable to such arrangements.
Problems with no such data dependencies (and thus no need to communicate between processors) are referred to as "embarrassingly parallel". Those problems represent a small subset of the space of all problems that can be parallelized.
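To make the pipeline arrangement mentioned above a little more concrete, here is a minimal sketch using threads and queues from the Python standard library. The helper names (`stage`, `run_pipeline`) are invented for this example. In CPython the threads share the interpreter lock, so this shows the structure of a pipeline rather than a real speedup for CPU-bound stages.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the stream

def stage(func, inbox, outbox):
    """Apply func to each item from inbox and pass the result to outbox."""
    while True:
        item = inbox.get()
        if item is _DONE:
            outbox.put(_DONE)   # tell the next stage to stop too
            return
        outbox.put(func(item))

def run_pipeline(items, funcs):
    """Run items through funcs, one thread per stage, preserving order."""
    queues = [queue.Queue() for _ in range(len(funcs) + 1)]
    workers = [
        threading.Thread(target=stage, args=(f, queues[i], queues[i + 1]))
        for i, f in enumerate(funcs)
    ]
    for w in workers:
        w.start()
    for item in items:
        queues[0].put(item)
    queues[0].put(_DONE)
    results = []
    while True:
        out = queues[-1].get()
        if out is _DONE:
            break
        results.append(out)
    for w in workers:
        w.join()
    return results

# Each item still passes through the stages in order, but while stage 2
# handles item k, stage 1 can already be working on item k + 1.
outputs = run_pipeline([1, 2, 3], [lambda x: x + 1, lambda x: x * 10])
```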
Many algorithms consist of several steps in which each step depends on the previous one. Some of these algorithms can be restructured so that steps run in parallel, and some cannot. I think binary search is of the second type, so you are not wrong. You can, however, parallelize it by running multiple searches at once.
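One possible reading of this, applied to the f(mu) problem from the question, is to evaluate many points per round instead of one midpoint, then keep only the sub-interval where the crossing lies. The sketch below assumes f is increasing and that the crossing f(mu) = alpha is bracketed by [lo, hi]; the function name, its parameters, and the use of a thread pool are all assumptions made for illustration. On a real farm the pool would be replaced by whatever job submission mechanism is available.

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_section_search(f, alpha, lo, hi, pool, k=100, tol=1e-9, max_rounds=64):
    """Locate mu in [lo, hi] where an increasing f crosses alpha.

    Each round evaluates f at k equally spaced interior points in parallel
    and shrinks the bracket to the one sub-interval containing the crossing.
    Assumes f(lo) < alpha <= f(hi).
    """
    for _ in range(max_rounds):          # guard against floating-point stalls
        if hi - lo <= tol:
            break
        step = (hi - lo) / (k + 1)
        points = [lo + step * i for i in range(1, k + 1)]
        values = list(pool.map(f, points))   # the k evaluations run concurrently
        new_lo, new_hi = points[-1], hi      # default: crossing is past the last point
        for i, (p, v) in enumerate(zip(points, values)):
            if v >= alpha:                   # first point at or above alpha
                new_hi = p
                new_lo = points[i - 1] if i > 0 else lo
                break
        lo, hi = new_lo, new_hi
    return (lo + hi) / 2

# Toy stand-in for the expensive function: solve mu**3 = 2 on [0, 2].
with ThreadPoolExecutor(max_workers=8) as pool:
    mu = parallel_section_search(lambda x: x ** 3, 2.0, 0.0, 2.0, pool, k=8)
```

With k = 100 points per round the bracket shrinks by a factor of 101 each round, compared with 2 for plain bisection. That is worth about log2(101) ≈ 6.66 bisection steps per round, so the gain from 100 cores is roughly 6.7x in wall-clock time rather than 100x.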