OpenMP tasking problems with the Intel compilers (and Clang)

Posted 2025-02-12 07:03:27

The code below shows problems with OpenMP tasking in ICL 2021.6.0 and in ICX 2022.1.0 (Clang based).
Firstly, I am wondering whether I am doing something fundamentally wrong in my OpenMP code that just shows up differently under different compilers.
Assuming the code is valid OpenMP...
When the function fails_intel_icl() is compiled with ICL, the task execution is simply wrong: some tasks are run twice, some not at all. Compiled with ICX/Clang it executes as I expect.
When crash_icx_2022() is compiled with ICX it crashes at runtime. I am testing with Visual Studio 2022/Debug/x64 and the latest oneAPI Base and HPC installation.

An example of the incorrect runtime behaviour of fails_intel_icl() when compiled with ICL is as follows:

Thread:12 launching task for 0,1 <--- you will note the task for pair 0,1 never runs.

Thread:12 launching task for 0,2

Thread:9 Executing task with pair 0,2
....

#include <iostream>
#include <utility>   // std::pair
#include <vector>
#include <omp.h>

std::vector<std::pair<int, std::vector<int>>> data;

void setup()
{
    std::vector<int> tmp({ 1,2,3,4,5 });
    for (int i = 0; i < 5; i++)
    {
        data.push_back({ i,tmp });
    }
}


void DoTask(int a, int b)
{
    {
#pragma omp critical
        std::cout << "Thread:" << omp_get_thread_num() << " Executing task with pair " << a << ',' << b << std::endl;
    }
}
// runs correctly under icl, but crashes at runtime with icx and clang
void crash_icx_2022()
{
#   pragma omp parallel
    {
#   pragma omp single
        {
            for (auto iter = data.begin(); iter != data.end(); ++iter)
            {
                const auto& a = iter->first;
                const auto& b = iter->second;
                for (const auto& aa : b)
                {
                    if (aa != a)
                    {
                        {
#pragma omp critical
                            std::cout << "Thread:" << omp_get_thread_num() << " launching task for " << ' ' << a << ',' << aa << std::endl;
                        }
#   pragma omp task
                        {
                            DoTask(a, aa);
                        }
                    }
                }
            }
        }
    }
}


// this compiles and runs incorrectly under icl but runs correctly with icx or clang

void fails_intel_icl()
{
#   pragma omp parallel
    {
#   pragma omp single
        {
            for (auto iter = data.begin(); iter != data.end(); ++iter)
            {
                const auto a = iter->first;
                const auto b = iter->second;
                for (const auto aa : b)
                {
                    if (aa != a)
                    {
                        {
#pragma omp critical
                            std::cout << "Thread:" << omp_get_thread_num() << " launching task for " << ' ' << a << ',' << aa << std::endl;
                        }
#   pragma omp task
                        {
                            DoTask(a, aa);
                        }
                    }
                }
            }
        }
    }
}


void testTaskingBug()
{
    setup();

    std::cout << "\nStarting test using copies\n" << std::endl;
    fails_intel_icl();
    std::cout << "\nStarting test using references" << std::endl;
    crash_icx_2022();
}
int main()
{
    testTaskingBug();
    return 0;
}
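
The reproducer above relies on reading the interleaved log by eye to spot tasks that ran twice or never ran. A minimal sketch of a programmatic check follows. The names `make_reproducer_data`, `count_task_runs` and `every_pair_ran_once` are hypothetical and not part of the original post. It mirrors the "copies" variant and counts how often each pair's task body executes.

```cpp
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

using Pair = std::pair<int, int>;
using PairList = std::vector<std::pair<int, std::vector<int>>>;

// Hypothetical helper: builds the same 5-entry data set as setup() in the post.
PairList make_reproducer_data()
{
    PairList d;
    const std::vector<int> tmp{ 1, 2, 3, 4, 5 };
    for (int i = 0; i < 5; ++i)
        d.push_back({ i, tmp });
    return d;
}

// Launches one task per (a, aa) pair with aa != a and records how many
// times each task body actually ran.
std::map<Pair, int> count_task_runs(const PairList& data)
{
    std::map<Pair, int> runs;
#pragma omp parallel
    {
#pragma omp single
        {
            for (auto iter = data.begin(); iter != data.end(); ++iter)
            {
                int a = iter->first;          // copied, as in fails_intel_icl()
                for (int aa : iter->second)
                {
                    if (aa != a)
                    {
#pragma omp task
                        {
                            // The map is shared, so updates are serialized.
#pragma omp critical
                            ++runs[Pair(a, aa)];
                        }
                    }
                }
            }
        }
    }   // the barrier ending the parallel region guarantees all tasks have finished
    return runs;
}

// True when exactly `expected` distinct pairs ran, each exactly once.
bool every_pair_ran_once(const std::map<Pair, int>& runs, std::size_t expected)
{
    if (runs.size() != expected)
        return false;
    for (const auto& kv : runs)
        if (kv.second != 1)
            return false;
    return true;
}
```

For the data set in the post, key 0 matches none of the values 1..5 and keys 1..4 each match one, so 5 + 4*4 = 21 tasks are expected. A call such as `every_pair_ran_once(count_task_runs(make_reproducer_data()), 21)` should therefore return true with a conforming compiler.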

The following C++17 code will not compile under Clang. I am not sure whether the error is legitimate.

void clang_wont_compile()
{
#   pragma omp parallel
    {
#   pragma omp single
        {
            for (const auto& [a, b] : data)
            {
                for (const auto& aa : b)
                {
                    if (aa != a)
                    {
#   pragma omp task
                        DoTask(a, aa);
                    }
                }
            }
        }
    }
}
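
The post does not quote the compiler error, so the cause is an assumption here: Clang has been known to reject structured bindings that are captured by an OpenMP region. If that is what happens, one possible workaround is to copy the binding into an ordinary variable before the task and pass it explicitly with `firstprivate`. The sketch below uses hypothetical names (`make_data`, `run_with_named_copies`) and has not been checked against the compiler versions discussed in this thread.

```cpp
#include <cassert>
#include <utility>
#include <vector>

using PairList = std::vector<std::pair<int, std::vector<int>>>;

// Hypothetical helper: same shape of data as setup() in the post.
PairList make_data()
{
    PairList d;
    const std::vector<int> tmp{ 1, 2, 3, 4, 5 };
    for (int i = 0; i < 5; ++i)
        d.push_back({ i, tmp });
    return d;
}

// Returns the (a, aa) pairs whose tasks were executed.
std::vector<std::pair<int, int>> run_with_named_copies(const PairList& data)
{
    std::vector<std::pair<int, int>> done;
#pragma omp parallel
    {
#pragma omp single
        {
            // The structured binding is only used outside the task region.
            for (const auto& [key, values] : data)
            {
                int a = key;                  // ordinary variable, safe to capture
                for (int aa : values)
                {
                    if (aa != a)
                    {
#pragma omp task firstprivate(a, aa) shared(done)
                        {
                            assert(a != aa);  // each task sees its own copies
#pragma omp critical
                            done.emplace_back(a, aa);
                        }
                    }
                }
            }
        }
    }
    return done;
}
```

The explicit `firstprivate(a, aa)` only spells out what the default data-sharing rules already imply for variables local to the `single` block. It is written out here so the intent does not depend on how a given compiler handles the implicit case.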

Comments (2)

帝王念 2025-02-19 07:03:28

Thanks for pointing this out. It does look like it should be valid OpenMP code. Maybe something in the backend is being thrown off by the combination of task and critical, or the combination is not allowed by the spec, but the latter doesn't seem to be the case.

I am double-checking with some OpenMP folks to see whether we have a bug on this (or a better explanation of the behavior).

演出会有结束 2025-02-19 07:03:28

So after more investigation I seem to have answers:

  • The OpenMP code is valid and all variations of the functions should
    run correctly.

  • icl (Intel classic) and icx (Clang based) have some bugs as of the versions I have tested with.

  • A newer Clang compiler I was able to test with (14.0.6) has
    resolved the issues and executes correctly.
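
Since the behaviour depends on the exact compiler version, it can help to have the reproducer report which toolchain built it. The following is a small sketch using the predefined version macros. The function name `describe_toolchain` is hypothetical, and the output format is only an illustration.

```cpp
#include <string>

// Hypothetical helper: describes the compiler and OpenMP level used for this build.
std::string describe_toolchain()
{
    std::string out;
    // icx also defines __clang__, so the Intel macros are checked first.
#if defined(__INTEL_LLVM_COMPILER)
    out += "Intel LLVM-based compiler " + std::to_string(__INTEL_LLVM_COMPILER);
#elif defined(__INTEL_COMPILER)
    out += "Intel classic compiler " + std::to_string(__INTEL_COMPILER);
#elif defined(__clang__)
    out += "clang " + std::to_string(__clang_major__) + "." +
           std::to_string(__clang_minor__) + "." +
           std::to_string(__clang_patchlevel__);
#elif defined(_MSC_VER)
    out += "msvc " + std::to_string(_MSC_VER);
#elif defined(__GNUC__)
    out += "gcc " + std::to_string(__GNUC__) + "." +
           std::to_string(__GNUC_MINOR__);
#else
    out += "unknown compiler";
#endif

    // _OPENMP holds the yyyymm date of the supported OpenMP specification.
#if defined(_OPENMP)
    out += ", _OPENMP=" + std::to_string(_OPENMP);
#else
    out += ", OpenMP disabled";
#endif
    return out;
}
```

Printing the returned string at the start of `main()` would make logs from different compilers easier to tell apart when comparing runs.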
