OpenMP target offloading matrix multiplication compile error
I am currently trying to implement a simple matrix multiplication of two n×n matrices using OpenMP target offloading. The code is taken from here:
    template<typename T>
    void multiplyJIK(T *A, T *B, T *C, uint64_t size) {
    #pragma omp target data device(0) map(to: A[0:size*size], B[0:size * size], size) map(tofrom: C[0:size * size])
        {
    #pragma omp target teams device(0) num_teams(32768) thread_limit(512) \
            map(to: A[0:size*size], B[0:size * size], size) map(tofrom: C[0:size * size]) \
            default(none) shared(A, B, C, size)
    #pragma omp distribute parallel for num_threads(512) dist_schedule(static, 512) \
            default(none) shared(A, B, C, size)
            for (uint64_t j = 0; j < size; ++j) {
                for (uint64_t i = 0; i < size; ++i) {
                    for (uint64_t k = 0; k < size; ++k) {
                        C[i * size + j] += A[i * size + k] * B[k * size + j];
                    }
                }
            }
        }
    }
It should multiply the two matrices A and B and store the result in C. The matrices are represented as one-dimensional arrays of length size * size.
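To make the intended result concrete, here is a minimal serial sketch of the same computation on the host. It assumes row-major storage, which is what the index expression `i * size + j` in the kernel implies. The name `multiplyReference` is hypothetical and not part of the original code. Like the kernel, it accumulates into C, so C is expected to be zero-initialized by the caller.

```cpp
#include <cstdint>

// Hypothetical host-side reference: C += A * B for square, row-major matrices.
// Element (i, j) of a size x size matrix lives at index i * size + j.
template <typename T>
void multiplyReference(const T *A, const T *B, T *C, std::uint64_t size) {
    for (std::uint64_t i = 0; i < size; ++i) {
        for (std::uint64_t j = 0; j < size; ++j) {
            T sum = C[i * size + j];  // keep the += semantics of the kernel
            for (std::uint64_t k = 0; k < size; ++k) {
                sum += A[i * size + k] * B[k * size + j];
            }
            C[i * size + j] = sum;
        }
    }
}
```

Once the offloaded version compiles, its output can be compared element by element against this reference.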
For my test, T is float, and I try to compile the code using the NVHPC toolkit:
nvc++ -std=c++17 -mp=gpu -target=gpu main.cpp -o matmul
and get this error:
error: item must appear in a SHARED or PRIVATE clause:
C[i * size + j] += A[i * size + k] * B[k * size + j];
^
detected during instantiation of "void Target::multiplyJIK(T *, T *, T *, uint64_t) [with T=float]"
I don't understand this error, as the C array should be correctly mapped (map(tofrom: C...)) and is present in the shared(...) clause. Am I missing something in the code, or is this a problem with the compile flags?
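One way to narrow this down is to compile a variant that drops `default(none)` and the explicit `shared(...)` clauses and relies on the `map` clauses alone. The sketch below is an untested guess for isolating the problem, not a confirmed fix. It assumes a single combined `target teams distribute parallel for` directive is acceptable for the use case, and the name `multiplyCombined` is hypothetical. If the diagnostic disappears with this version, the data-sharing clauses rather than the mapping are the more likely trigger.

```cpp
#include <cstdint>

// Hypothetical variant for isolating the error: one combined directive,
// no default(none), no shared(...). The scalar `size` is left to the
// implicit data-sharing rules instead of being listed explicitly.
template <typename T>
void multiplyCombined(const T *A, const T *B, T *C, std::uint64_t size) {
    const std::uint64_t n = size * size;  // element count of each matrix
    #pragma omp target teams distribute parallel for collapse(2) \
        map(to: A[0:n], B[0:n]) map(tofrom: C[0:n])
    for (std::uint64_t i = 0; i < size; ++i) {
        for (std::uint64_t j = 0; j < size; ++j) {
            T sum = C[i * size + j];  // each (i, j) is written by one iteration only
            for (std::uint64_t k = 0; k < size; ++k) {
                sum += A[i * size + k] * B[k * size + j];
            }
            C[i * size + j] = sum;
        }
    }
}
```

Without OpenMP enabled the pragma is ignored and the loops run serially on the host, so the function can also be checked on a machine without a GPU.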