OpenMP segmentation fault in C++

Posted on 2025-01-11 03:09:26

I have a very straightforward function that counts how many inner entries of an N by N 2D matrix (represented by a pointer arr) are below a certain threshold, and updates a counter below_threshold that is passed by reference:

void count(float *arr, const int N, const float threshold, int &below_threshold) {
    below_threshold = 0;  // make sure it is reset
    bool comparison;
    float temp;
    
    #pragma omp parallel for shared(arr, N, threshold) private(temp, comparison) reduction(+:below_threshold)
    for (int i = 1; i < N-1; i++)  // count only the inner N-2 rows
    {
        for (int j = 1; j < N-1; j++)  // count only the inner N-2 columns
        {
            temp = *(arr + i*N + j);
            comparison = (temp < threshold);
            below_threshold += comparison;
        }
    }
}

When I do not use OpenMP, it runs fine (thus, the allocation and initialization were done correctly already).

When I use OpenMP with an N that is less than around 40000, it runs fine.

However, once I start using a larger N with OpenMP, it keeps giving me a segmentation fault (I am currently testing with N = 50000 and would like to eventually get it up to ~100000).

Is there something wrong with this at a software level?


P.S. The allocation was done dynamically ( float *arr = new float [N*N] ), and here is the code used to randomly initialize the entire matrix, which didn't have any issues with OpenMP with large N:

void initialize(float *arr, const int N)
{
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
    {
        for (int j = 0; j < N; j++)
        {
            *(arr + i*N + j) = static_cast <float> (rand()) / static_cast <float> (RAND_MAX);
        }
    }

}

UPDATE:

I have tried changing i, j, and N to long long int, and it still has not fixed the segmentation fault. If this were the issue, why does it work without OpenMP? It fails only once I add #pragma omp ...

Comments (1)

菩提树下叶撕阳。 2025-01-18 03:09:26

I think it is because the index you compute (50000*50000 = 2500000000) exceeds INT_MAX (2147483647) in C++. The expression i*N + j is evaluated in int, so it overflows, and signed integer overflow makes the array access undefined behaviour.
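To make the arithmetic concrete, here is a small sketch that computes the largest flat index the inner loops touch (i*N + j at i = j = N-2) in 64-bit arithmetic and compares it with INT_MAX. The helper names max_inner_index and index_overflows_int are made up for illustration and are not from the question, and the sketch assumes int is 32 bits, as on most current platforms.

```cpp
#include <climits>
#include <cstdint>

// Largest flat index read by the inner loops of count(): i = j = N-2.
// Evaluated in 64-bit so this calculation itself cannot overflow
// for the N values discussed here.
std::int64_t max_inner_index(std::int64_t N) {
    return (N - 2) * N + (N - 2);
}

// True when that index no longer fits in an int
// (assumption: int is 32 bits, so INT_MAX is 2147483647).
bool index_overflows_int(std::int64_t N) {
    return max_inner_index(N) > static_cast<std::int64_t>(INT_MAX);
}
```

For N = 40000 the largest index is 1599959998, which still fits in an int; for N = 50000 it is 2499949998, which does not. That matches the reported behaviour. Since signed overflow is undefined behaviour, the serial build appearing to work does not show that its index arithmetic is valid.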

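So you should compute the index in a wider integer type, such as std::size_t or long long, or whatever type suits your use case. Note that unsigned int (maximum UINT_MAX, typically 4294967295) would hold 2500000000 but not the roughly 10^10 needed for N = 100000.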
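As a rough sketch of what that could look like, assuming the overflow is indeed the cause: the version below does all index arithmetic in 64-bit types and also widens the counter, since up to (N-2)^2 entries can match and that too exceeds INT_MAX for large N. The name count_below and the choice to return the count rather than write through a reference are mine, not the asker's. A signed long long loop variable is used for the parallel loop, which assumes a compiler supporting OpenMP 3.0 or later.

```cpp
#include <cstddef>

// Sketch of count() with 64-bit indexing (hypothetical rewrite).
// Counts inner entries of an N x N row-major matrix below threshold.
long long count_below(const float *arr, std::size_t N, float threshold) {
    if (N < 3) return 0;  // no inner entries to count

    long long below = 0;  // 64-bit counter: (N-2)^2 can exceed INT_MAX
    const long long n = static_cast<long long>(N);

    // Variables declared inside the loop are private automatically,
    // so no private(...) clause is needed.
    #pragma omp parallel for reduction(+:below)
    for (long long i = 1; i < n - 1; i++) {  // inner N-2 rows
        // Row offset computed in std::size_t, never in int
        const float *row = arr + static_cast<std::size_t>(i) * N;
        for (std::size_t j = 1; j + 1 < N; j++) {  // inner N-2 columns
            below += (row[j] < threshold);
        }
    }
    return below;
}
```

The allocation would need the same care: in new float[N*N] with an int N, the product N*N is also evaluated in int, so the size should be computed as something like static_cast<std::size_t>(N) * N.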
