Implicit barrier at the end of #pragma

Posted 2024-12-29 05:19:47


Friends, I am trying to learn the OpenMP paradigm.
I used the following code to understand the #pragma omp for directive.

#include <stdio.h>
#include <omp.h>

int main(void){
    int tid;
    int i;

    omp_set_num_threads(5);
    #pragma omp parallel \
        private(tid)
    {
        tid = omp_get_thread_num();
        printf("tid=%d started ...\n", tid);
        fflush(stdout);

        #pragma omp for
        for(i=1; i<=20; i++){
            printf("t%d - i%d \n",
                    omp_get_thread_num(), i);
            fflush(stdout);
        }

        printf("tid=%d work done ...\n", tid);
    }

    return 0;
}

In the above code, there is an implicit barrier at the end of #pragma omp parallel, meaning all the threads 0,1,2,3,4 must reach there before going to the next statement.

So, to check this barrier, I enclosed this "pragma for" in a condition if(tid!=0), meaning all threads except thread 0 (i.e. 1,2,3,4) should complete their work in the loop and then wait for thread 0 indefinitely. But, to my surprise, this is not happening: every thread does its iterations and completes successfully, i.e. t1 completes iterations 5,6,7,8 ---- t2 does 9,10,11,12 ---- t3 does 13,14,15,16 ---- and t4 does 17,18,19,20. Please note: iterations 1,2,3,4 were never executed.

To dig deeper, instead of tid!=0 I enclosed the same #pragma omp for in if(tid!=1), meaning thread 1 bypasses the barrier instead of thread 0. To my surprise, the program now hangs, with all threads waiting for thread 1.

Can somebody please explain this unexpected behavior? The final code that hangs:

#include <stdio.h>
#include <omp.h>

int main(void){
    int tid;
    int i;

    omp_set_num_threads(5);
    #pragma omp parallel \
        private(tid)
    {
        tid = omp_get_thread_num();
        printf("tid=%d started ...\n", tid);
        fflush(stdout);

        if(tid != 1){
            /* worksharing */
            #pragma omp for
            for(i=1; i<=20; i++){
                printf("t%d - i%d \n",
                        omp_get_thread_num(), i);
                fflush(stdout);
            }
        }else{
            printf("t1 reached here. \n");
        }

        printf("tid=%d work done ...\n", tid);
    }

    return 0;
}

I tried declaring i shared or private, but it did not change the program's behavior.


Comments (1)

缘字诀 2025-01-05 05:19:47


The problem here is that the behaviour is undefined by the standard. From Section 2.5, line 21 of the OpenMP 3.1 specification (the text has stayed more or less the same since the beginning):

• Each worksharing region must be encountered by all threads in a team
or by none at all.

Where omp for is a worksharing construct. So yes, I too would normally expect a hang with your code, but the compiler is entitled to assume that what you're doing never happens, and so the end result -- it sometimes hangs and sometimes doesn't, depending on which threads you hold up -- maybe isn't that surprising.
