Matlab 中向量化循环的双重求和

发布于 2024-11-10 16:50:55 字数 963 浏览 7 评论 0原文

我想向量化这个双 for 循环，因为它是我的代码中的瓶颈。由于 Matlab 是一种基于 1 的索引语言，我必须为 M = 0 创建一个附加项。

R、r、lambda 是常量

Slm(L,M)、Clm(L,M) 是矩阵 70x70

Plm(L,M)是矩阵 70x71

Cl(L),Pl(L) 是向量 70x1

% function dU_r
  s1 = 0;
  for L = 2:70
     s1 = s1 + ((R/r)^L)*(L+1)*Pl(L)*Cl(L);
     for m = 1:L
        s1 = s1 + ((R/r)^L)*(L+1)*Plm(L,M)*(Clm(L,M)*...
cos(M*lambda) + Slm(L,M)*sin(M*lambda));
     end
  end

  dU_r = -(mu_p/(r^2))*s1;


  % function dU_phi

  s2=0;
  for L = 2:70
      s2 = s2 + ((R/r)^L))*Plm(L,1)*Cl(L);
          for m = 1:l
          s2 = s2 + ((R/r)^L)*(Plm(L,M+1)-M*tan(phi)*Plm(L,M))*...
  (Clm(L,M)*cos(M*lambda) + Slm(L,M)*sin(M*lambda));
          end;
  end;
  dU_phi = (mu_p/r)*s2;

  % function dU_lambda

  s3=0;
      for L=2:70
          for m=1:L
          s3 = s3 + ((R/r)^L)*M*Plm(L,M)*(Slm(L,M)*cos(M*lambda)...
              - Clm(L,M)*sin(M*lambda));
          end;
  dU_lambda = (mu_p/r)*s3;

原文

I want to vectorize this double for loop because it is a bottleneck in my code. Since Matlab is a based-one indexing language I have to create an additional term for M = 0.

R,r,lambda are constants

Slm(L,M),Clm(L,M) are matrices 70x70

Plm(L,M) is a matrix 70x71

Cl(L),Pl(L) are vectors 70x1

% function dU_r
  s1 = 0;
  for L = 2:70
     s1 = s1 + ((R/r)^L)*(L+1)*Pl(L)*Cl(L);
     for m = 1:L
        s1 = s1 + ((R/r)^L)*(L+1)*Plm(L,M)*(Clm(L,M)*...
cos(M*lambda) + Slm(L,M)*sin(M*lambda));
     end
  end

  dU_r = -(mu_p/(r^2))*s1;


  % function dU_phi

  s2=0;
  for L = 2:70
      s2 = s2 + ((R/r)^L))*Plm(L,1)*Cl(L);
          for m = 1:l
          s2 = s2 + ((R/r)^L)*(Plm(L,M+1)-M*tan(phi)*Plm(L,M))*...
  (Clm(L,M)*cos(M*lambda) + Slm(L,M)*sin(M*lambda));
          end;
  end;
  dU_phi = (mu_p/r)*s2;

  % function dU_lambda

  s3=0;
      for L=2:70
          for m=1:L
          s3 = s3 + ((R/r)^L)*M*Plm(L,M)*(Slm(L,M)*cos(M*lambda)...
              - Clm(L,M)*sin(M*lambda));
          end;
  dU_lambda = (mu_p/r)*s3;

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

莫言歌 2024-11-17 16:50:55

我认为以下代码可能是解决方案的一部分（仅部分矢量化），对于完整的矢量化，您可能需要查看 meshgrid、ndgrid、bsxfun 和/或 repmat，我想。

s2     = 0;
L      = 2:70;
PlCl   = Pl.*Cl;
PlmClm = Plm.*Clm;
Rrl    = (R/r).^(L).*(L+1);
COS    = cos((1:70)*lambda);
SIN    = sin((1:70)*lambda);

for l=L
    M  = 1:l;
    s1 = sum(Rrl(l-1)*PlmClm(l,M).*(COS(M) + Slm(l,M).*SIN(M)));
    s2 = s2 + s1;
end;
s2 = s2 + sum(Rrl(L).*PlCl(L));

我还没有尝试运行此代码，因此它可能会抱怨某些尺寸。你应该能够弄清楚这一点（只需在这里或那里进行一些转置就可以了）。

附带说明一下：尽量避免使用短变量名（当然还有像 l 这样不明确的变量名（可能会被误读为 1、I 或 l）。此外，它可能是如果您有实际的（即未编码的）表达式来开始，则更容易矢量化您的代码

：应用 gnovice 建议的校正。

I think the following code may be a part of the solution (only partially vectorized), for a full vectorization you might want to look at meshgrid, ndgrid, bsxfun and/or repmat, I suppose.

s2     = 0;
L      = 2:70;
PlCl   = Pl.*Cl;
PlmClm = Plm.*Clm;
Rrl    = (R/r).^(L).*(L+1);
COS    = cos((1:70)*lambda);
SIN    = sin((1:70)*lambda);

for l=L
    M  = 1:l;
    s1 = sum(Rrl(l-1)*PlmClm(l,M).*(COS(M) + Slm(l,M).*SIN(M)));
    s2 = s2 + s1;
end;
s2 = s2 + sum(Rrl(L).*PlCl(L));

I haven't tried to run this code, so if it may complain about some dimensions. You should be able to figure that out (just some transposes here or there may do).

Just a note on the side: try to keep away from short variable names (and certainly ambiguous ones like l (that can be misread for a 1, a I or an l). Also, it might be easier to vectorize your code if you have actual (i.e. not coded) expressions to start with.

edit: applied corections suggested by gnovice

回复收藏 0 原文

焚却相思 2024-11-17 16:50:55

对于这种特殊情况，不完全矢量化您的解决方案可能会更有效。正如 Egon 所示，可以通过矢量化替换内部循环，但替换外部 for 循环也可能会减慢您的解决方案。

原因是什么？如果您注意如何在循环中对矩阵 Plm、Clm 和 Slm 进行索引，您将只使用其中的下三角部分。主对角线上方的所有值都将被忽略。通过完全矢量化您的解决方案，使其使用矩阵运算而不是循环，您最终会与矩阵的清零上三角部分执行不必要的乘法。在这种情况下，for 循环可能是更好的选择。

因此，这是 Egon 的答案在执行数学运算的数量方面应该接近最佳值：

index = 2:70;
coeff = ((R/r).^index).*(index+1);
lambda = lambda.*(1:70).';  %'
cosVector = cos(lambda);
sinVector = sin(lambda);
s2 = coeff*(Pl(index).*Cl(index));
for L = index
  M = 1:L;
  s2 = s2 + coeff(L-1)*((Plm(L,M).*Clm(L,M))*cosVector(M) + ...
                        (Plm(L,M).*Slm(L,M))*sinVector(M));
end

For this particular situation, it will likely be more efficient for you to not fully vectorize your solution. As Egon illustrates, it's possible to replace the inner loop through vectorization, but replacing the outer for loop as well could potentially slow-down your solution.

The reason? If you pay attention to how you index the matrices Plm, Clm, and Slm in your loop, you only ever use values from the lower triangular parts of them. All of the values above the main diagonal are ignored. By fully vectorizing your solution, such that it uses matrix operations instead of a loop, you would end up performing unnecessary multiplications with the zeroed-out upper triangular portions of the matrices. This is one of those cases where a for loop may be a better option.

As such, here's a refined and corrected version of Egon's answer that should be close to optimum with regard to the number of mathematical operations performed:

index = 2:70;
coeff = ((R/r).^index).*(index+1);
lambda = lambda.*(1:70).';  %'
cosVector = cos(lambda);
sinVector = sin(lambda);
s2 = coeff*(Pl(index).*Cl(index));
for L = index
  M = 1:L;
  s2 = s2 + coeff(L-1)*((Plm(L,M).*Clm(L,M))*cosVector(M) + ...
                        (Plm(L,M).*Slm(L,M))*sinVector(M));
end

回复收藏 0 原文

暗喜 2024-11-17 16:50:55

参考 Jan Simon 1 给出的解决方案，我有为我的问题编写了以下代码

Rr = R/r;
RrL = Rr;  
cosLambda = cos((1:70)* lambda);
sinLambda = sin((1:70)* lambda);
u1 = uint8(1);
s1 = 0;
s2 = 0;
s3 = 0;
for j = uint8(2):uint8(70)
   RrL = RrL * Rr;
   q1 = RrL * (double(j) + 1);
   t1 = Pl(j) * datos.Cl(j);
   q2 = RrL;
   t2 = Plm(j,1) * Cl(j);
   t3 = 0;
   for m = u1:j
      t1 = t1 + Plm(j,m) * ...
         (Clm(j, m) * cosLambda(m) + ...
          Slm(j, m) * sinLambda(m));
        t2 = t2 + (Plm(j,m+1)-double(m)*tan_phi*Plm(j,m))*...
    (Clm(j,m)*cosLambda(m) + Slm(j,m)*sinLambda(m));
        t3 = t3 + double(m)*Plm(j,m)*(Slm(j,m)*cosLambda(m)...
                - Clm(j,m)*sinLambda(m));
     end
     s1 = s1 + q1 * t1;
     s2 = s2 + q2 * t2;
     s3 = s3 + q2 * t3;
  end
dU_r = -(mu_p/(r^2))*s1;
dU_phi = (mu_p/r)*s2;
dU_lambda = (mu_p/r)*s3

Taking like reference a solution given by Jan Simon 1, I have written the following code for my problem

Rr = R/r;
RrL = Rr;  
cosLambda = cos((1:70)* lambda);
sinLambda = sin((1:70)* lambda);
u1 = uint8(1);
s1 = 0;
s2 = 0;
s3 = 0;
for j = uint8(2):uint8(70)
   RrL = RrL * Rr;
   q1 = RrL * (double(j) + 1);
   t1 = Pl(j) * datos.Cl(j);
   q2 = RrL;
   t2 = Plm(j,1) * Cl(j);
   t3 = 0;
   for m = u1:j
      t1 = t1 + Plm(j,m) * ...
         (Clm(j, m) * cosLambda(m) + ...
          Slm(j, m) * sinLambda(m));
        t2 = t2 + (Plm(j,m+1)-double(m)*tan_phi*Plm(j,m))*...
    (Clm(j,m)*cosLambda(m) + Slm(j,m)*sinLambda(m));
        t3 = t3 + double(m)*Plm(j,m)*(Slm(j,m)*cosLambda(m)...
                - Clm(j,m)*sinLambda(m));
     end
     s1 = s1 + q1 * t1;
     s2 = s2 + q2 * t2;
     s3 = s3 + q2 * t3;
  end
dU_r = -(mu_p/(r^2))*s1;
dU_phi = (mu_p/r)*s2;
dU_lambda = (mu_p/r)*s3

回复收藏 0 原文

~没有更多了~