Restrict pointers as function arguments in OpenMP?
I don't know how OpenMP works, but I presume calling a function with restricted pointer arguments inside a parallel for loop doesn't work if the objects could be shared by multiple threads? Take the following example of serial code meant to perform a weighted sum across matrix columns:
const int n = 10;
const double x[n][n] = {...}; // matrix, containing some numbers
const double w[n] = {...}; // weights, containing some numbers
// my weighted sum function
double mywsum(const double *restrict px, const double *restrict pw, const int n) {
    double tmp = 0.0;
    for(int i = 0; i < n; ++i) tmp += px[i] * pw[i];
    return tmp;
}
double res[n]; // results vector
const double *pw = &w[0]; // creating pointer to w
// loop doing column-wise weighted sum
for(int j = 0; j < n; ++j) {
    res[j] = mywsum(&x[0][j], pw, n);
}
Now I want to parallelize this loop using OpenMP, e.g.:
#pragma omp parallel for
for(int j = 0; j < n; ++j) {
    res[j] = mywsum(&x[0][j], pw, n);
}
I believe the *restrict px could still be valid, as the particular elements pointed to can only be accessed by one thread at a time, but the *restrict pw should cause problems, as the elements of w are accessed concurrently by multiple threads, so the restrict qualifier should be removed here?
Comments (1)
The restrict keyword is completely independent of multithreading. It tells the compiler that, within the function, the object the pointer targets is not aliased, that is, not accessed through any other pointer. It exists to rule out aliasing in C. The fact that other threads may call the function is not a problem. If threads wrote to the same location, you would have a much bigger problem: a race condition. If multiple threads only read the same location, there is no problem, with or without the restrict keyword. The compiler basically does not care about multithreading when it compiles mywsum: since the function contains no locks, atomic operations, or memory barriers, it can ignore the effects of other threads. The qualifier can be removed because it is not useful here, not because it causes any issue.
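To make the point about concurrent reads concrete, here is a minimal sketch. It keeps mywsum exactly as in the question and calls it from a parallel loop in which every thread reads the same weights while each iteration writes a different result slot. The driver weighted_row_sums and the flat row-major layout of x are assumptions made for this illustration (they are not in the original code), and rows are used instead of columns so that the indexing stays in bounds.

```c
#include <stddef.h>

/* Same function as in the question: it only reads through px and pw. */
static double mywsum(const double *restrict px, const double *restrict pw, const int n)
{
    double tmp = 0.0;
    for (int i = 0; i < n; ++i)
        tmp += px[i] * pw[i];
    return tmp;
}

/* Hypothetical driver (not from the question): x is a rows*cols matrix
 * stored row-major in a flat array. All threads read the same w, and
 * each iteration writes its own res[j], so there is no data race. */
static void weighted_row_sums(int rows, int cols, const double *x,
                              const double *w, double *res)
{
    #pragma omp parallel for
    for (int j = 0; j < rows; ++j)
        res[j] = mywsum(&x[(size_t)j * cols], w, cols);
}
```

With GCC or Clang this would be built with -fopenmp; without that flag the pragma is ignored and the loop simply runs serially, giving the same results.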
The restrict keyword is not very useful here, since the compiler can easily see that no overlap is possible. Indeed, the only store in the loop is the one to tmp, a local variable, and the input arguments cannot point to tmp precisely because it is local to the function. In fact, compilers will keep tmp in a register if optimizations are enabled (so in practice it does not even have an address).
Keep in mind that restrict is bound to the scope of the function where it is declared (i.e., mywsum). Thus, inlining the function or calling it in a multithreaded context has no impact on the result as far as the restrict keyword is concerned.
I think &x[0][j] is wrong, because the loop in the function iterates over n items and the pointer starts at the j-th item of the first row. The function therefore reads n consecutive elements starting at x[0][j], not the n elements of column j. This means the loop accesses items up to x[0][j+n-1], which in theory is an out-of-bounds access for any j > 0. In practice you will observe no error, because 2D C arrays are flattened in memory and &x[0][n] should be equal to &x[1][0] in your case. The result will certainly not be what you want.
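One possible way to get an actual column-wise weighted sum is to index the matrix as x[i][j] inside the function, so that every access stays inside its own row. The sketch below is only an illustration under assumptions: the names col_wsum and all_col_wsums and the compile-time size N are hypothetical and do not come from the question.

```c
#include <assert.h>

#define N 3  /* matrix size, chosen arbitrarily for this sketch */

/* Weighted sum of column j: sum over i of x[i][j] * w[i]. */
static double col_wsum(const double x[N][N], const double *restrict w, int j)
{
    assert(j >= 0 && j < N);  /* guard against an invalid column index */
    double tmp = 0.0;
    for (int i = 0; i < N; ++i)
        tmp += x[i][j] * w[i];
    return tmp;
}

/* Each iteration reads the shared x and w and writes only res[j]. */
static void all_col_wsums(const double x[N][N], const double *w, double res[N])
{
    #pragma omp parallel for
    for (int j = 0; j < N; ++j)
        res[j] = col_wsum(x, w, j);
}
```

As before, the parallel loop only shares data that is read, so keeping or dropping restrict on w does not change its correctness.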