对大量矩阵进行并行化或向量化所有对抗操作?

发布于 2024-09-02 09:02:17 字数 1299 浏览 5 评论 0原文

我有大约 5,000 个矩阵,它们具有相同的行数和不同的列数 (20 x ~200)。这些矩阵中的每一个都必须在动态规划算法中与其他矩阵进行比较。

在这个问题中,我问如何快速进行比较并得到了涉及 2D 卷积的出色答案。串行地,迭代地应用该方法,

list = who('data_matrix_prefix*')
H = cell(numel(list),numel(list));  
for i=1:numel(list)
    for j=1:numel(list)
        if i ~= j
            eval([ 'H{i,j} = compare(' char(list(i)) ',' char(list(j)) ');']);
        end
    end
end

对于数据的小子集来说是快速的(例如,对于 9 个矩阵,9*9 - 9 = 72 次调用在 ~1 秒内进行,870 次调用在 ~2.5 秒内进行)。
然而,对所有数据进行操作需要近 2500 万次调用。
我还尝试使用 deal() 来创建一个完全由数据中的下一个元素组成的元胞数组,因此我可以在单个循环中使用 cellfun() :

# who(), load() and struct2cell() calls place k data matrices in a 1D cell array called data.
nextData = cell(k,1);
for i=1:k
    [nextData{:}] = deal(data{i});
    H{:,i} = cellfun(@compare,data,nextData,'UniformOutput',false);
end

不幸的是,这实际上并没有更快,因为所有时间都在比较中()。这两个代码示例似乎都不适合并行化。我无法弄清楚如何对变量进行切片。
Compare() 是完全矢量化的;它专门使用矩阵乘法和 conv2() (我的印象是所有这些操作,包括 cellfun(),都应该在 MATLAB 中是多线程的?)。

有人看到(明确的)并行解决方案或更好的问题矢量化吗?

注意
我意识到我的两个例子都是低效的 - 如果第一个例子计算三角形元胞数组,速度会是原来的两倍,而第二个例子仍然在计算自我比较。但是,良好的并行化所节省的时间更像是 16 倍(如果我在每个人的机器上安装 MATLAB,则为 72 倍)。

旁白
还有一个内存问题。我使用了几次评估将 H 的每一列附加到文件中,名称如 H1、H2 等,然后清除 Hi。不幸的是,保存速度非常慢......

I have approximately 5,000 matrices with the same number of rows and varying numbers of columns (20 x ~200). Each of these matrices must be compared against every other in a dynamic programming algorithm.

In this question, I asked how to perform the comparison quickly and was given an excellent answer involving a 2D convolution. Serially, iteratively applying that method, like so

list = who('data_matrix_prefix*')
H = cell(numel(list),numel(list));  
for i=1:numel(list)
    for j=1:numel(list)
        if i ~= j
            eval([ 'H{i,j} = compare(' char(list(i)) ',' char(list(j)) ');']);
        end
    end
end

is fast for small subsets of the data (e.g. for 9 matrices, 9*9 - 9 = 72 calls are made in ~1 s, 870 calls in ~2.5 s).
However, operating on all the data requires almost 25 million calls.
I have also tried using deal() to make a cell array composed entirely of the next element in data, so I could use cellfun() in a single loop:

# who(), load() and struct2cell() calls place k data matrices in a 1D cell array called data.
nextData = cell(k,1);
for i=1:k
    [nextData{:}] = deal(data{i});
    H{:,i} = cellfun(@compare,data,nextData,'UniformOutput',false);
end

Unfortunately, this is not really any faster, because all the time is in compare(). Both of these code examples seem ill-suited for parallelization. I'm having trouble figuring out how to make my variables sliced.
compare() is totally vectorized; it uses matrix multiplication and conv2() exclusively (I am under the impression that all of these operations, including the cellfun(), should be multithreaded in MATLAB?).

Does anyone see a (explicitly) parallelized solution or better vectorization of the problem?

Note
I realize both my examples are inefficient - the first would be twice as fast if it calculated a triangular cell array, and the second is still calculating the self comparisons, as well. But the time savings for a good parallelization are more like a factor of 16 (or 72 if I install MATLAB on everyone's machines).

Aside
There is also a memory issue. I used a couple of evals to append each column of H into a file, with names like H1, H2, etc. and then clear Hi. Unfortunately, the saves are very slow...

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

比忠 2024-09-09 09:02:17

compare(a,b) == compare(b,a)

如果

compare(a,a) == 1

,则将循环更改

for i=1:numel(list)
    for j=1:numel(list)
    ...
    end
end

for i=1:numel(list)
    for j= i+1 : numel(list)
    ...
    end
end

并处理对称性和恒等性情况。这将使您的计算时间减少一半。

Does

compare(a,b) == compare(b,a)

and

compare(a,a) == 1

If so, change your loop

for i=1:numel(list)
    for j=1:numel(list)
    ...
    end
end

to

for i=1:numel(list)
    for j= i+1 : numel(list)
    ...
    end
end

and deal with the symmetry and identity case. This will cut your calculation time by half.

鹤仙姿 2024-09-09 09:02:17

第二个示例可以轻松地切片以与并行处理工具箱一起使用。该工具箱将代码迭代分布在最多 8 个不同的本地处理器之间。如果你想在集群上运行代码,你还需要分布式计算工具箱。

%# who(), load() and struct2cell() calls place k data matrices in a 1D cell array called data.

parfor i=1:k-1 %# this will run the loop in parallel with the parallel processing toolbox
    %# only make the necessary comparisons
    H{i+1:k,i} = cellfun(@compare,data(i+1:k),repmat(data(i),k-i,1),'UniformOutput',false);

    %# if the above doesn't work, try this
    hSlice = cell(k,1);
    hSlice{i+1:k} = cellfun(@compare,data(i+1:k),repmat(data(i),k-i,1),'UniformOutput',false);
    H{:,i} = hSlice;
end

The second example can be easily sliced for use with the Parallel Processing Toolbox. This toolbox distributes iterations of your code among up to 8 different local processors. If you want to run the code on a cluster, you also need the Distributed Computing Toolbox.

%# who(), load() and struct2cell() calls place k data matrices in a 1D cell array called data.

parfor i=1:k-1 %# this will run the loop in parallel with the parallel processing toolbox
    %# only make the necessary comparisons
    H{i+1:k,i} = cellfun(@compare,data(i+1:k),repmat(data(i),k-i,1),'UniformOutput',false);

    %# if the above doesn't work, try this
    hSlice = cell(k,1);
    hSlice{i+1:k} = cellfun(@compare,data(i+1:k),repmat(data(i),k-i,1),'UniformOutput',false);
    H{:,i} = hSlice;
end
生生不灭 2024-09-09 09:02:17

如果我理解正确的话,你必须执行 5000^2 矩阵比较?也许您应该考虑您的问题是由 5000^2 个任务组成,而不是尝试并行比较函数? Matlab 并行计算工具箱支持基于任务的并行性。不幸的是,我在 PCT 方面的经验是大型线性代数类型问题的并行化,因此我无法告诉您更多信息。该文档无疑会给您带来更多帮助。

If I understand correctly you have to perform 5000^2 matrix comparisons ? Rather than try to parallelise the compare function, perhaps you should think of your problem being composed of 5000^2 tasks ? The Matlab Parallel Compute Toolbox supports task-based parallelism. Unfortunately my experience with PCT is with parallelisation of large linear algebra type problems so I can't really tell you much more than that. The documentation will undoubtedly help you more.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文