OpenMP segmentation fault in C++
I have a very straightforward function that counts how many inner entries of an N by N 2D matrix (represented by a pointer arr) are below a certain threshold, and updates a counter below_threshold that is passed by reference:
void count(float *arr, const int N, const float threshold, int &below_threshold) {
    below_threshold = 0; // make sure it is reset
    bool comparison;
    float temp;
    #pragma omp parallel for shared(arr, N, threshold) private(temp, comparison) reduction(+:below_threshold)
    for (int i = 1; i < N-1; i++) // count only the inner N-2 rows
    {
        for (int j = 1; j < N-1; j++) // count only the inner N-2 columns
        {
            temp = *(arr + i*N + j);
            comparison = (temp < threshold);
            below_threshold += comparison;
        }
    }
}
When I do not use OpenMP, it runs fine (thus, the allocation and initialization were done correctly already).
When I use OpenMP with an N that is less than around 40000, it runs fine.
However, once I start using a larger N with OpenMP, it keeps giving me a segmentation fault (I am currently testing with N = 50000 and would like to eventually get it up to ~100000).
Is there something wrong with this at a software level?
P.S. The allocation was done dynamically (float *arr = new float [N*N]), and here is the code used to randomly initialize the entire matrix, which didn't have any issues with OpenMP with large N:
void initialize(float *arr, const int N)
{
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
    {
        for (int j = 0; j < N; j++)
        {
            *(arr + i*N + j) = static_cast <float> (rand()) / static_cast <float> (RAND_MAX);
        }
    }
}
UPDATE:
I have tried changing i, j, and N to long long int, and it still has not fixed my segmentation fault. If this was the issue, why did it already work without OpenMP? It is only once I add #pragma omp ... that it fails.
Comments (1)
I think it is because your value (50000*50000 = 2500000000) exceeds INT_MAX (2147483647) in C++. As a result, the array access behaviour is undefined. So you should use unsigned int (whose maximum is UINT_MAX) or some other type that suits your use case. Note that for N around 100000 the product N*N = 10000000000 also exceeds a 32-bit UINT_MAX (4294967295), so a 64-bit type such as long long is needed there.
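
The overflow argument can be checked with a little arithmetic, and the indexing can be widened accordingly. The following is only a minimal sketch, not the asker's actual fix: the names max_inner_offset, offset_fits_in_int and count_below are hypothetical, and it assumes a row-major N by N array and a platform where int is 32 bits and long long is 64 bits. It returns the count as a long long instead of writing to an int reference, because the count itself can reach (N-2)*(N-2), which is close to 10^10 for N around 100000.

```cpp
#include <cassert>
#include <climits>

// Hypothetical helper: largest element offset i*N + j touched by the inner
// loops (i = j = N-2), computed in 64-bit arithmetic so it cannot wrap.
long long max_inner_offset(int N) {
    const long long n = N;
    return (n - 2) * n + (n - 2);
}

// Hypothetical helper: true when every inner offset is representable as int.
bool offset_fits_in_int(int N) {
    return max_inner_offset(N) <= INT_MAX;
}

// Counts the inner entries of a row-major N x N matrix that are below
// `threshold`. The loop indices and the row offset are long long, so i * n
// is evaluated in 64-bit arithmetic instead of overflowing int.
long long count_below(const float *arr, int N, float threshold) {
    assert(N >= 0);
    const long long n = N;
    long long below = 0;  // 64-bit accumulator: the count may exceed INT_MAX
    #pragma omp parallel for reduction(+:below)
    for (long long i = 1; i < n - 1; ++i) {      // inner N-2 rows
        const float *row = arr + i * n;          // 64-bit offset
        for (long long j = 1; j < n - 1; ++j) {  // inner N-2 columns
            below += (row[j] < threshold);
        }
    }
    return below;
}
```

With these definitions, max_inner_offset(50000) is 2499949998, which is above INT_MAX, while max_inner_offset(40000) is 1599959998, which still fits. That is consistent with the failures starting somewhere above N = 40000. The same care would apply to the allocation: in new float [N*N] the product is evaluated in int when N is an int, so widening it first (for example to long long or std::size_t) is probably needed as well.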