Different behavior between scalar and parallel code

Posted 2024-12-19 08:48:42

I'm wondering why the following code produces different results in its scalar and parallel variants:

#define N 10
double P[N][N];
// zero the matrix just to be sure...
for (int i=0; i<N; i++)
    for(int j=0; j<N; j++)
        P[i][j]=0.0;


double xmin=-5.0,ymin=-5.0,xmax=5.0,ymax=5.0;
double x=xmin,y=ymin;
double step= abs(xmax-xmin)/(double)(N - 1 );
for (int i=0; i<N; i++)
{
    #pragma omp parallel for ordered schedule(dynamic)
    for ( int j=0; j<N; j++)
    {
        x = i*step+xmin;
        y = j*step+ymin;
        P[i][j]=x+y;
    }
}

The two versions of this code produce slightly different results (the scalar version simply has the #pragma ... line commented out).
What I've noticed is that a very small percentage of the elements of P[i][j] in the parallel version differ from those of the scalar version, but I'm wondering why...

Putting the #pragma on the outer loop, as suggested, is a mess: the results are completely wrong.

P.S.
g++-4.4, Intel i7, Linux

画离情绘悲伤 2024-12-26 08:48:42

Ah, now I can see the problem. Your comment on the last question didn't have enough context for me to see it. But now it's clear.

The problem is here:

    x = i*step+xmin;
    y = j*step+ymin;

x and y are declared outside the parallel region, so they are shared among all the threads. One thread can overwrite x or y between another thread's assignment and its own use of them in P[i][j]=x+y, which is a nasty race condition.

To fix it, make them local:

for ( int j=0; j<N; j++)
{
    double x = i*step+xmin;
    double y = j*step+ymin;
    P[i][j]=x+y;
}

With this fix, you should be able to put the #pragma on the outer loop instead of the inner loop.
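
For reference, here is a minimal sketch of the fix with the #pragma on the outer loop. The function names fill_parallel and fill_parallel_private and the helper grid_step are hypothetical, introduced only for this illustration. The second variant is an alternative that keeps x and y declared outside the loop and uses OpenMP's private clause instead. The sketch also uses std::fabs rather than the unqualified abs from the question, so the floating-point overload is certain to be the one called.

```cpp
#include <cassert>
#include <cmath>

constexpr int N = 10;

// Hypothetical helper: spacing of N points covering [lo, hi].
static double grid_step(double lo, double hi) {
    assert(hi != lo);  // sanity check: a degenerate range would give a zero step
    return std::fabs(hi - lo) / (N - 1);
}

// Variant 1: x and y are declared inside the loop body, so every thread
// works on its own copies. j is declared in the inner for statement and
// is therefore private as well.
void fill_parallel(double P[N][N], double xmin, double xmax, double ymin) {
    const double step = grid_step(xmin, xmax);
    #pragma omp parallel for
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++) {
            double x = i * step + xmin;
            double y = j * step + ymin;
            P[i][j] = x + y;
        }
    }
}

// Variant 2: x and y stay declared outside the loop, as in the question,
// but the private clause gives each thread its own uninitialized copies.
// That is safe here because both are assigned before they are read.
void fill_parallel_private(double P[N][N], double xmin, double xmax, double ymin) {
    const double step = grid_step(xmin, xmax);
    double x, y;
    #pragma omp parallel for private(x, y)
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++) {
            x = i * step + xmin;
            y = j * step + ymin;
            P[i][j] = x + y;
        }
    }
}
```

Compile with g++ -fopenmp to enable the pragmas. Without that flag the pragmas are ignored and both functions run serially, which gives a convenient reference to compare against.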
