使用阿姆达尔定律计算性能增益

发布于 2025-01-04 20:22:25 字数 532 浏览 5 评论 0原文

我对阿姆达尔定律来确定性能增益和串行应用程序部分感到困惑，但未能弄清楚这一点。

已知的情况如下：

S(N) = Speedup factor for (N) CPU's
N = Number of CPU's
f = The part of the program which is executed sequential
S(N) = N / ( 1 + f * ( N - 1 ) )

如果我有 4 个 CPU，加速系数（性能增益）为 3 倍。 f 会是什么？

我的猜测：

S(N) = 3 (that's our performance gain using 4 CPU's)
N = 4

因此，在公式中输入这些值：

3 = 4 / ( 1 + f * ( 4 - 1 ) )

当我说 f = 0,11 时，我正确吗？或者我需要将 S(N) 设置为 1（因此除以 3）？或者我做错了什么？

原文

I am puzzling with Amdahl's Law to determine performance gains and the serial application part and fail to figure out this one.

Known is the following:

S(N) = Speedup factor for (N) CPU's
N = Number of CPU's
f = The part of the program which is executed sequential
S(N) = N / ( 1 + f * ( N - 1 ) )

If I have 4 CPU's and a speedup factor (performance gain) of 3x. What would f be?

My guess:

S(N) = 3 (that's our performance gain using 4 CPU's)
N = 4

So entering these values in the formula:

3 = 4 / ( 1 + f * ( 4 - 1 ) )

Am I correct when I say that f = 0,11? Or do I need to set S(N) to 1 (so divide by 3)? Or am I doing something else wrong?

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

魔法少女 2025-01-11 20:22:26

我的一位同学对此给出了（到目前为止有效/正确的）答案。

我做了以下课程：
删除以消除混乱。

这应该可以解决它。

编辑：

好的，以前的答案是错误的，但我找到了解决方案。

您首先计算可以并行完成的部分（它在维基百科上，但我花了一段时间才理解），然后计算串行部分。

所以最后一堂课就变成了这样：

/**
* @param s - The maximum performance profit. 
* @param n - The amount of processors that are usable..
* @return f - The sequential part of the program.
*/
public final double sequentialCalculation(final double s, final double n) {
    double p = 0; //the part that can be executed in parallel
    double f = 0;

    if (s <= 0) {
        throw new IllegalArgumentException("S > 0");
    }

    if (n <= 0) {
        throw new IllegalArgumentException("N > 0");
    }

    p = ((1 / s) - 1) / ((1 / n) - 1);

    f = 1 - p;

    return f;
}

不客气。

A classmate of mine gave the (so far working/right) answer for this.

I made the following class:
REMOVED TO COUNTERACT CONFUSION.

This should solve it.

EDIT:

Ok, the previuos answer is wrong, but I found the solution.

You first calculate The part that can be done parallel (it's on wikipedia but it took me a while to understand) and THEN you calculate the serial part.

so the final class becomes this:

/**
* @param s - The maximum performance profit. 
* @param n - The amount of processors that are usable..
* @return f - The sequential part of the program.
*/
public final double sequentialCalculation(final double s, final double n) {
    double p = 0; //the part that can be executed in parallel
    double f = 0;

    if (s <= 0) {
        throw new IllegalArgumentException("S > 0");
    }

    if (n <= 0) {
        throw new IllegalArgumentException("N > 0");
    }

    p = ((1 / s) - 1) / ((1 / n) - 1);

    f = 1 - p;

    return f;
}

You are welcome.

回复收藏 0 原文

瘫痪情歌 2025-01-11 20:22:26

如果这是您应该使用的方程式，我认为您的想法有点错误，所以让我尝试解释一下。

f 是程序在单核实现中未并行化的代码部分所花费的时间的百分比（也称为 0 <= f <= 1）。例如，如果您有这样的程序：

// this takes 15 seconds
init();

for (int i = 0; i < 10; i++) {
    // this takes 10 seconds, and will be split
    // between threads when in parallel
    calculate();
}

// this takes 5 seconds
finalize();

这将在 15+(10*10)+5=120 秒内运行（串行）。但是，如果并行实施，则有 20 秒的执行时间无法分配给多个核心。这意味着即使并行部分加速，只需要 10 秒即可执行所有 10 次迭代，整个程序仍将花费 30 秒。这就是 f 告诉我们的——有多少问题可以从并行化中受益。在此示例中，由于总共 120 秒中的 20 秒必须连续完成，因此 f = 20/120 = 1/6。

使用这个新的 f 值，您可以根据 Amdahl 获得加速。一个免责声明 - 这远不是测量速度的唯一方法，不同的方法都有自己的优点和缺点。

I think you are thinking about it a little wrong if this is the equation you're supposed to be using, so let me try to explain.

f is the percentage (aka, 0 <= f <= 1) of the time your program spent in the part of the code that you didn't parallelize in the single core implementation. For example, if you have a program like this:

// this takes 15 seconds
init();

for (int i = 0; i < 10; i++) {
    // this takes 10 seconds, and will be split
    // between threads when in parallel
    calculate();
}

// this takes 5 seconds
finalize();

This will run (in serial) in 15+(10*10)+5=120 seconds. However, if implemented in parallel, there are 20 seconds of execution that can't be split amongst multiple cores. This means that even if the parallel piece is sped up so it only takes 10 seconds to perform all 10 iterations, the whole program will still take 30 seconds. This is what f helps tell us - how much of the problem can benefit from parallelization. In this example, as 20 seconds out of the total 120 must be done serially, f = 20/120 = 1/6.

Using this new value of f, you can get the speed up according to Amdahl. One disclaimer - this is far from the only way of measuring speed up, and different methods have their own advantages and disadvantages.

回复收藏 0 原文

~没有更多了~