Thread groups for independent tasks in Fortran with OpenMP
I need to run two independent tasks using OpenMP. One of them is way more involved than the other, so it would be ideal to split the available threads such that the more complicated task uses more of them. After these two tasks are finished, I need to use both of their outputs. I am not entirely sure if this can be done with OpenMP, so any suggestion would be very useful.
This is an attempt to illustrate what I need. There are two independent subroutines with separate inputs and outputs. Subroutine mysub2 is more complex than mysub1. It has multiple nested loops, so it would benefit more from having more threads running it. Out of 6 threads, I would like to assign 2 of them to execute mysub1 and 4 of them to execute mysub2, simultaneously. Once each subroutine has produced its output, z1 and z2 respectively, both are used to compute z3.
In this attempt I was trying to assign threads 0 and 1 to task 1, and the other 4 to task 2. Obviously, this doesn't work as intended because it runs mysub1 twice and mysub2 four times, but I have no idea how to achieve what I need.
module mymod
  implicit none
contains

  subroutine mysub1(x1,y1,z1)
    ! Element-wise product of vectors
    real,intent(in) :: x1(:),y1(:)
    real,intent(out) :: z1(size(x1))
    integer :: i
    !$omp parallel do private(i)
    do i = 1,size(x1)
      z1(i) = x1(i) * y1(i)
    end do
    !$omp end parallel do
    print *, 'Done with mysub1'
  end subroutine mysub1

  subroutine mysub2(x2,y2,z2)
    ! Matrix multiplication
    real,intent(in) :: x2(:,:),y2(:,:)
    real,intent(out) :: z2(size(x2,1),size(y2,2))
    integer :: i,j
    !$omp parallel do private(i,j)
    do i = 1,size(x2,1)
      do j = 1,size(y2,2)
        z2(i,j) = dot_product(x2(i,:), y2(:,j))
      end do
    end do
    !$omp end parallel do
    print *, 'Done with mysub2'
  end subroutine mysub2

end module mymod

program main
  use omp_lib
  use mymod
  implicit none
  integer :: tid
  integer,parameter :: m = 2
  integer,parameter :: n = 3
  integer,parameter :: p = 4
  real :: x1(m),y1(m),z1(m)
  real :: x2(m,n),y2(n,p),z2(m,p),z3

  ! Setting total number of threads to 6
  call omp_set_num_threads(6)

  ! Assigning arbitrary values for illustration purposes
  x1 = 1.0
  y1 = 2.0
  x2 = 3.0
  y2 = 4.0

  !$omp parallel private(tid)
  ! Getting thread number
  tid = omp_get_thread_num()
  if ((tid == 0) .or. (tid == 1)) then
    ! Task 1 to be executed in two threads, tid = 0,1
    call mysub1(x1,y1,z1)
  else
    ! Task 2 to be executed in four threads, tid = 2,3,4,5
    call mysub2(x2,y2,z2)
  end if
  !$omp end parallel

  ! Using z1 and z2 (serially, no need to parallelize)
  z3 = sum(z1) + sum(z2)
  print *, 'Final output', z3
end program main
Of course, this is just an example. I know I don't need to use mysub2 to do matrix multiplication. I'm just trying to illustrate that mysub2 is more complex and hence, it would be ideal to use more threads for it, without having to paste several hundred lines of the actual code I have.
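For reference, one way the 2/4 split described above might be expressed is with nested parallelism: an outer parallel sections region with two threads, where each section calls one subroutine exactly once and the subroutine opens its own inner team. This is only a minimal sketch under stated assumptions, not a tested solution for the real code. The extra nthreads argument, the module name split_mod, and the collapse(2) clause are additions that are not in the original code, and the inner num_threads values are requests that the runtime may not honor in full.

```fortran
module split_mod
  implicit none
contains

  subroutine mysub1(x1, y1, z1, nthreads)
    ! Element-wise product of vectors
    real, intent(in)    :: x1(:), y1(:)
    real, intent(out)   :: z1(size(x1))
    integer, intent(in) :: nthreads   ! hypothetical argument: requested inner team size
    integer :: i
    !$omp parallel do num_threads(nthreads) private(i)
    do i = 1, size(x1)
      z1(i) = x1(i) * y1(i)
    end do
    !$omp end parallel do
  end subroutine mysub1

  subroutine mysub2(x2, y2, z2, nthreads)
    ! Matrix multiplication
    real, intent(in)    :: x2(:,:), y2(:,:)
    real, intent(out)   :: z2(size(x2,1), size(y2,2))
    integer, intent(in) :: nthreads   ! hypothetical argument: requested inner team size
    integer :: i, j
    ! collapse(2) is an addition: it shares all (i,j) pairs among the team
    !$omp parallel do collapse(2) num_threads(nthreads) private(i,j)
    do i = 1, size(x2,1)
      do j = 1, size(y2,2)
        z2(i,j) = dot_product(x2(i,:), y2(:,j))
      end do
    end do
    !$omp end parallel do
  end subroutine mysub2

end module split_mod

program main
  use omp_lib
  use split_mod
  implicit none
  integer, parameter :: m = 2, n = 3, p = 4
  real :: x1(m), y1(m), z1(m)
  real :: x2(m,n), y2(n,p), z2(m,p), z3

  ! Same arbitrary values as in the question
  x1 = 1.0
  y1 = 2.0
  x2 = 3.0
  y2 = 4.0

  ! Allow two active levels of parallelism (outer sections, inner loops)
  call omp_set_max_active_levels(2)

  ! Two outer threads; each section is executed once by one of them
  !$omp parallel sections num_threads(2)
  !$omp section
  call mysub1(x1, y1, z1, 2)   ! inner team of up to 2 threads
  !$omp section
  call mysub2(x2, y2, z2, 4)   ! inner team of up to 4 threads
  !$omp end parallel sections

  ! The end of the parallel region is a barrier, so z1 and z2 are complete here
  z3 = sum(z1) + sum(z2)
  print '(a,f8.1)', 'Final output', z3
end program main
```

With gfortran this would be built with the -fopenmp flag. Each outer thread becomes the primary thread of its inner team, so the two teams together use at most 2 + 4 = 6 threads. Whether this split actually pays off for the real subroutines depends on how much work each one does and on the runtime's thread placement, so it would need to be measured.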