Vectorizing loops in Matlab - performance issues
This question is related to these two:
Introduction to vectorizing in MATLAB - any good tutorials?
filter that uses elements from two arrays at the same time
Based on the tutorials I read, I tried to vectorize a procedure that takes a very long time to run.
I've rewritten this:
    function B = bfltGray(A,w,sigma_r)
    dim = size(A);
    B = zeros(dim);
    for i = 1:dim(1)
        for j = 1:dim(2)
            % Extract local region.
            iMin = max(i-w,1);
            iMax = min(i+w,dim(1));
            jMin = max(j-w,1);
            jMax = min(j+w,dim(2));
            I = A(iMin:iMax,jMin:jMax);
            % Compute Gaussian intensity weights.
            F = exp(-0.5*(abs(I-A(i,j))/sigma_r).^2);
            B(i,j) = sum(F(:).*I(:))/sum(F(:));
        end
    end
into this:
    function B = rngVect(A, w, sigma)
    W = 2*w+1;
    I = padarray(A, [w,w],'symmetric');
    I = im2col(I, [W,W]);
    H = exp(-0.5*(abs(I-repmat(A(:)', size(I,1),1))/sigma).^2);
    B = reshape(sum(H.*I,1)./sum(H,1), size(A, 1), []);
Where:
A is a 512x512 matrix,
w is half of the window size, usually equal to 5,
sigma is a parameter in the range [0 1] (usually one of: 0.1, 0.2 or 0.3).
So the I matrix would have 512x512x121 = 31719424 elements.
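The element count above translates into memory as follows. This is only a rough sketch, assuming double precision (8 bytes per element) and counting just the two big arrays I and H; the variable names nElem and bytesPerArray are made up for the illustration.

```matlab
% Estimated storage for the im2col result I and the weight matrix H,
% assuming both are stored as doubles (8 bytes per element).
m = 512; n = 512; w = 5;           % image size and half window from above
W = 2*w + 1;                       % full window width: 11
nElem = W^2 * m * n;               % 121 * 262144 = 31719424 elements
bytesPerArray = nElem * 8;         % bytes needed by one such array
fprintf('I alone:        %.0f MB\n', bytesPerArray / 2^20);
fprintf('I and H:        %.0f MB\n', 2 * bytesPerArray / 2^20);
```

That is 242 MB per array, and temporaries such as the repmat result and H.*I are each the same size again, so the peak usage can be several times that figure.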
But this version seems to be as slow as the first one, and in addition it uses a lot of memory and sometimes causes memory problems.
I suppose I've done something wrong, probably some logic mistake in the vectorization. In fact I'm not surprised: this method creates really big matrices, and the computations probably take proportionally longer.
I have also tried to write it using nlfilter (similar to the second solution given by Jonas) but it seems to be hard since I use Matlab 6.5 (R13) (there are no sophisticated function handles available).
So once again, I'm not asking for a ready-made solution, but for some ideas that would help me solve this in reasonable time. Maybe you can point out what I did wrong.
Edit:
As Mikhail suggested, the results of profiling are as follows:
65% of time was spent in the line H= exp(...)
25% of time was used by im2col
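To put numbers on how both versions scale with image size, a timing loop along these lines could be used. It is a sketch only: it assumes bfltGray and rngVect from above are on the path, and the test sizes and the random input are arbitrary choices, not taken from the original measurements.

```matlab
% Time both versions on square test images of increasing size.
w = 5; sigma = 0.2;              % typical values mentioned above
sizes = [64 128 256 512];        % arbitrary test sizes (assumption)
for k = 1:length(sizes)
    A = rand(sizes(k));          % random image with values in [0 1]
    tic; B1 = bfltGray(A, w, sigma); t1 = toc;
    tic; B2 = rngVect(A, w, sigma);  t2 = toc;
    fprintf('%4d x %-4d  loop: %7.2f s   vectorized: %7.2f s\n', ...
            sizes(k), sizes(k), t1, t2);
end
```

Note that B1 and B2 should agree in the interior of the image but can differ within w pixels of the border, because bfltGray truncates the window there while rngVect pads the image symmetrically.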
How big are I and H (i.e. numel(I)*8 bytes)? If you start paging, then the performance of your second solution is going to be affected very badly.
To test whether you really have a problem due to too large arrays, you can try and measure the speed of the calculation using tic and toc for arrays A of increasing size. If the execution time increases faster than by the square of the size of A, or if the execution time jumps at some size of A, you can try and split the padded I into a number of sub-arrays and perform the calculations like that.
Otherwise, I don't see any obvious places where you could be losing lots of time. Well, maybe you could skip the reshape by replacing B with A in your function (saves a little memory as well), and writing A(:) = sum(H.*I,1)./sum(H,1);
You may also want to look into upgrading to a more recent version of Matlab - they've worked hard on improving performance.
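The idea of splitting the padded array into sub-arrays could be sketched roughly as below. The function name rngVectBlocks and the parameter blk (number of image columns handled per pass) are hypothetical, introduced only for this illustration. The code sticks to repmat rather than newer functions, so it should be usable on older releases, but it has not been tested on R13.

```matlab
function B = rngVectBlocks(A, w, sigma, blk)
% Block-wise variant of rngVect (illustrative sketch, not a tuned solution).
% Processes blk columns of A per pass so that the im2col result stays small.
W = 2*w + 1;
[m, n] = size(A);
P = padarray(A, [w w], 'symmetric');      % pad once, as in rngVect
B = zeros(m, n);
for j1 = 1:blk:n
    j2 = min(j1 + blk - 1, n);
    % Padded strip: columns j1..j2 of A plus w extra columns on each side.
    S = im2col(P(:, j1:j2 + 2*w), [W W]); % W^2-by-(m*(j2-j1+1))
    C = A(:, j1:j2);                      % window centres for this strip
    % abs() is dropped: it is redundant before squaring for real data.
    H = exp(-0.5 * ((S - repmat(C(:)', W*W, 1)) / sigma).^2);
    B(:, j1:j2) = reshape(sum(H .* S, 1) ./ sum(H, 1), m, []);
end
```

With a 512x512 image, w = 5 and blk = 32, each strip matrix S holds 121 x 16384 doubles, about 15 MB, instead of 242 MB for the whole image. The number of exp evaluations is unchanged, so this mainly helps if paging is the bottleneck.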