Restricted pointers as function arguments in OpenMP?

Posted on 2025-02-02 23:29:24

I don't know how OpenMP works, but I presume calling a function with restricted pointer arguments inside a parallel for loop doesn't work if the objects could be shared by multiple threads? Take the following example of serial code meant to perform a weighted sum across matrix columns:

const int n = 10; 
const double x[n][n] = {...};  // matrix, containing some numbers
const double w[n] = {...};     // weights, containing some numbers

// my weighted sum function
double mywsum(const double *restrict px, const double *restrict pw, const int n) {
    double tmp = 0.0;
    for(int i = 0; i < n; ++i) tmp += px[i] * pw[i];
    return tmp;
}

double res[n]; // results vector
const double *pw = &w[0]; // creating pointer to w 

// loop doing column-wise weighted sum
for(int j = 0; j < n; ++j) {
    res[j] = mywsum(&x[0][j], pw, n);
}

Now I want to parallelize this loop using OpenMP, e.g.:

#pragma omp parallel for
for(int j = 0; j < n; ++j) {
    res[j] = mywsum(&x[0][j], pw, n);
}

I believe the *restrict px could still be valid as the particular elements pointed to can only be accessed by one thread at a time, but the *restrict pw should cause problems as the elements of w are accessed concurrently by multiple threads, so the restrict clause should be removed here?

Comments (1)

骄兵必败 2025-02-09 23:29:24

"I presume calling a function with restricted pointer arguments inside a parallel for loop doesn't work if the objects could be shared by multiple threads?"

The restrict keyword is completely independent of multithreading. It tells the compiler that the object the pointer targets is not aliased, that is, it is not accessed through any other pointer in the function. Its purpose is to let the compiler rule out aliasing when optimizing C code. The fact that other threads can call the function is not a problem. If threads wrote to the same location, you would have a much bigger problem: a race condition. Multiple threads reading the same location is not a problem, with or without the restrict keyword. The compiler basically does not care about multithreading when it compiles mywsum: it can ignore the effect of other threads since the function contains no locks, atomic operations or memory barriers.
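To make the aliasing contract concrete, here is a minimal sketch. The function add_scaled and its parameters are hypothetical and do not appear in the question. Unlike mywsum, it writes through one of its pointers, which is the situation restrict is really about: the caller promises that the array modified through dst is not also reached through src. Nothing in that promise mentions threads, and an object that is only read may be shared freely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the question): dst[i] += k * src[i].
 * restrict tells the compiler that, inside this function, the elements
 * modified through dst are not also accessed through src, so it may
 * reorder or vectorize the loads and stores. Passing overlapping arrays
 * would break that promise; the number of threads calling the function
 * is irrelevant to it. */
void add_scaled(double *restrict dst, const double *restrict src,
                double k, size_t n)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; ++i)
        dst[i] += k * src[i];
}
```

Calling add_scaled(a, b, 0.5, 3) with two distinct arrays is fine; calling add_scaled(a, a + 1, 0.5, 2) would violate the contract, with or without OpenMP.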

"I believe the *restrict px could still be valid as the particular elements pointed to can only be accessed by one thread at a time, but the *restrict pw should cause problems as the elements of w are accessed concurrently by multiple threads, so the restrict clause should be removed here?"

It should be removed because it is not useful, not because it causes any issue.

Using the restrict keyword is not very useful here since the compiler can easily see that no overlap is possible. Indeed, the only store in the loop is to tmp, a local variable, and the input arguments cannot point to tmp. In fact, compilers will keep tmp in a register when optimizations are enabled (so in practice it does not even have an address).

Keep in mind that restrict is bound to the scope of the function where it is declared (i.e. the function mywsum). Thus, inlining the function or using it in a multithreaded context has no effect on what the restrict keyword means.
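As an illustration of that point, here is a minimal sketch in which every thread reads the same weights through a restrict-qualified parameter. It assumes row-wise sums, so that each slice passed to the function is contiguous (this is not the column-wise layout of the question), and the names dot, weighted_rows and N are hypothetical. Each call gets its own px and pw, the shared array w is only read, and each iteration writes a different res[j], so there is neither an aliasing violation nor a race. If the code is compiled without OpenMP support, the pragma is ignored and the loop simply runs serially.

```c
#include <assert.h>

enum { N = 4 };  /* hypothetical size, chosen for the example */

/* Same body as mywsum, restrict kept: the only thing written is tmp. */
static double dot(const double *restrict px, const double *restrict pw, int n)
{
    assert(n >= 0);
    double tmp = 0.0;
    for (int i = 0; i < n; ++i)
        tmp += px[i] * pw[i];
    return tmp;
}

/* res[j] = sum over i of x[j][i] * w[i]; all threads read w concurrently. */
void weighted_rows(const double x[N][N], const double w[N], double res[N])
{
    #pragma omp parallel for
    for (int j = 0; j < N; ++j)
        res[j] = dot(x[j], w, N);
}
```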


I think &x[0][j] is wrong because the loop in the function iterates over n contiguous items, starting at the j-th item of the first row. This means the loop accesses items up to x[0][j+n-1], which is theoretically an out-of-bounds access. In practice you will observe no error because 2D C arrays are flattened in memory, so &x[0][n] should be equal to &x[1][0] in your case. The result will certainly not be what you want.
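One possible way to get a true column-wise sum is sketched below. This is an assumption about what the question intended, not code from the original post, and the names wsum_col, column_wsums and N are hypothetical. Indexing x[i][j] directly walks down column j and keeps every access inside the declared bounds, instead of relying on the flattened layout.

```c
#include <assert.h>

enum { N = 3 };  /* hypothetical size, chosen for the example */

/* Hypothetical column variant of mywsum: sum over rows i of x[i][j] * w[i]. */
static double wsum_col(const double x[N][N], int j, const double w[N])
{
    assert(j >= 0 && j < N);
    double tmp = 0.0;
    for (int i = 0; i < N; ++i)
        tmp += x[i][j] * w[i];
    return tmp;
}

/* Each iteration handles one column and writes only its own res[j]. */
void column_wsums(const double x[N][N], const double w[N], double res[N])
{
    #pragma omp parallel for
    for (int j = 0; j < N; ++j)
        res[j] = wsum_col(x, j, w);
}
```

For x = {{1,2,3},{4,5,6},{7,8,9}} and w = {1, 0.5, 2}, column 0 gives 1*1 + 4*0.5 + 7*2 = 17, whereas a call in the style of mywsum(&x[0][0], w, 3) would combine the first row (1, 2, 3) with the weights instead.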
