Generating the partitions of a set in parallel

Posted on 2025-01-02 06:49:05

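I am using an algorithm (implemented in C) that generates the partitions of a set. (The code is here: http://www.martinbroadhurst.com/combinatorial-algorithms.html#partitions).

Is there a way to modify this algorithm so that it runs in parallel rather than sequentially?

My CPU has several cores, and I would like to split the generation of the partitions across multiple running threads.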


メ斷腸人バ 2025-01-09 06:49:05

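Initialize a shared collection containing every partition of the first k elements. Each thread, until the collection is empty, repeatedly removes a partition from the collection and generates all possibilities for the remaining n - k elements using the algorithm you linked to (it fetches another k-element prefix as soon as incrementing the current n-element partition would change one of the first k elements).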
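A minimal sketch of this scheme, under a few assumptions that are not in the answer: partitions are stored as restricted growth strings (element i carries the index of its subset), the helper names `next_rgs`, `all_prefixes` and `count_partitions` are hypothetical, and an OpenMP `parallel for` with a dynamic schedule stands in for the shared collection that the threads drain. It only counts the partitions; a real program would process `a` at the point where the counter is incremented.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Advance positions [lo, hi) of `a` to the next restricted growth string,
// leaving the prefix a[0..lo) untouched. Returns false when the prefix is exhausted.
bool next_rgs(std::vector<int>& a, int lo, int hi) {
    for (int i = hi - 1; i >= lo; --i) {
        int max_before = -1;
        for (int j = 0; j < i; ++j) max_before = std::max(max_before, a[j]);
        if (a[i] <= max_before) {          // a[i] may still grow by one
            ++a[i];
            for (int j = i + 1; j < hi; ++j) a[j] = 0;
            return true;
        }
    }
    return false;
}

// Every partition of the first k elements (the shared collection of prefixes).
std::vector<std::vector<int>> all_prefixes(int k) {
    std::vector<std::vector<int>> out;
    std::vector<int> a(k, 0);
    do out.push_back(a); while (next_rgs(a, 0, k));
    return out;
}

// Count the partitions of n elements, one k-element prefix per unit of work.
std::uint64_t count_partitions(int n, int k) {
    assert(0 <= k && k <= n);
    const std::vector<std::vector<int>> prefixes = all_prefixes(k);
    const long m = static_cast<long>(prefixes.size());
    std::uint64_t total = 0;
    // Dynamic scheduling hands the next unused prefix to whichever thread is free.
    #pragma omp parallel for schedule(dynamic) reduction(+:total)
    for (long p = 0; p < m; ++p) {
        std::vector<int> a(prefixes[p]);
        a.resize(n, 0);                    // first partition with this prefix
        do {
            ++total;                       // a real program would visit `a` here
        } while (next_rgs(a, k, n));       // stops before the prefix would change
    }
    return total;
}
```

Built with OpenMP enabled (for example `-fopenmp` with GCC), `count_partitions(10, 4)` spreads the 15 four-element prefixes over the available threads. Without OpenMP the pragma is ignored and the loop runs serially with the same result.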


江心雾 2025-01-09 06:49:05

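As you can see, the algorithm you mentioned creates a counter in base n and, at each step, puts the items that carry the same digit into one group; that is how it partitions the input.

Each counter counts from 0 up to the digit string (0,1,2,...,n-1) read as a base-n number, so the largest value is A = (n-1) + (n-2)*n + ... + 1*n^(n-2) + 0*n^(n-1). You can therefore run the algorithm on k different threads: the first thread counts from 0 to A/k, the second from (A/k)+1 to 2*A/k, and so on. This means adding a long variable and checking it against the upper bound (in the for loop condition), and also computing the value of A and the base-n digits of r*A/k for 0 <= r <= k.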
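The range split can be sketched roughly as follows. Several things here are assumptions rather than part of the answer: the function names are hypothetical, the chunks are half-open and balanced instead of the exact (A/k)+1 boundaries, and each counter value is filtered with a validity check (`is_rgs`), because not every base-n number in a range encodes a partition. The counter must fit in 64 bits, which limits this sketch to n <= 15.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Largest counter value A: the digits (0,1,...,n-1) read as a base-n number.
std::uint64_t max_counter(int n) {
    std::uint64_t a = 0;
    for (int d = 0; d < n; ++d) a = a * n + d;
    return a;
}

// Base-n digits of x, most significant first, padded to n digits.
std::vector<int> to_digits(std::uint64_t x, int n) {
    std::vector<int> d(n);
    for (int i = n - 1; i >= 0; --i) {
        d[i] = static_cast<int>(x % n);
        x /= n;
    }
    return d;
}

// A digit string encodes a partition only if every digit is at most
// one more than the largest digit to its left (and the first digit is 0).
bool is_rgs(const std::vector<int>& d) {
    int max_seen = -1;
    for (int v : d) {
        if (v > max_seen + 1) return false;
        max_seen = std::max(max_seen, v);
    }
    return true;
}

// Number of partitions whose counter value falls into chunk r of k.
// Each thread would call this with its own r.
std::uint64_t count_chunk(int n, int k, int r) {
    assert(1 <= n && n <= 15 && 1 <= k && 0 <= r && r < k);
    const std::uint64_t total = max_counter(n) + 1;   // counter values 0..A
    const std::uint64_t base = total / k, rem = total % k;
    const std::uint64_t ur = static_cast<std::uint64_t>(r);
    const std::uint64_t lo = base * ur + std::min(ur, rem);
    const std::uint64_t hi = lo + base + (ur < rem ? 1 : 0);
    std::uint64_t count = 0;
    for (std::uint64_t x = lo; x < hi; ++x)
        if (is_rgs(to_digits(x, n))) ++count;
    return count;
}
```

For n = 4 the largest counter value is 27 (the digits 0,1,2,3 in base 4), and the chunks together contain the 15 partitions of a four-element set. Valid strings are not spread evenly over the counter range, so equal-sized chunks can carry quite different amounts of work.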


不顾 2025-01-09 06:49:05

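First, consider the following variation of the serial algorithm. Take the element a and assign it to subset #0 (this is always valid, because the order of subsets inside a partition does not matter). The next element b may belong to the same subset as a or to a different one, i.e. to subset #1. Then the element c belongs either to #0 (together with a), or to #1 (together with b, if b is separate from a), or to its own subset (which is #1 if #0={a,b}, or #2 if #0={a} and #1={b}). And so on. So you add new elements one by one to partially built partitions, producing a few possible outputs for each input, until all elements are placed. The key to parallelization is that each incomplete partition can be extended with new elements independently, i.e. in parallel with all the other variants.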

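The algorithm can be implemented in different ways. I would use a recursive approach in which a function is given a partially filled array and its current length, copies the array once for each possible value of the next element (the next element can take any value from 0 up to one more than the largest value currently in the array), sets the next element to each of those values, and calls itself recursively for each new array with the length increased by one. This approach seems particularly well suited to work-stealing parallel engines such as cilk or tbb. An implementation similar to the one suggested by @swen is also possible: you use a collection of all incomplete partitions and a pool of threads, and each thread takes one partition from the collection, produces all of its possible extensions and puts them back into the collection; partitions to which all elements have been added should obviously go into a different collection.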
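The recursive variant might look roughly like the sketch below. It uses OpenMP tasks in place of cilk or tbb, which is a substitution and not what the answer names; the function names and the cutoff depth of 6 (below which new tasks are created) are likewise assumptions. The sketch counts complete partitions instead of storing them.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// `a[0..len)` is a partially built partition using subsets 0..blocks-1.
// Returns the number of complete partitions that extend it.
std::uint64_t extend(const std::vector<int>& a, int len, int blocks) {
    const int n = static_cast<int>(a.size());
    if (len == n) return 1;                 // complete partition: visit it here
    std::vector<std::uint64_t> sub(blocks + 1, 0);
    for (int v = 0; v <= blocks; ++v) {     // existing subsets, or a new one (v == blocks)
        std::vector<int> b = a;             // each variant gets its own copy
        b[len] = v;
        const int nb = (v == blocks) ? blocks + 1 : blocks;
        // Each variant is independent, so it can run as a separate task.
        #pragma omp task shared(sub) firstprivate(b, v, nb) if(len < 6)
        sub[v] = extend(b, len + 1, nb);
    }
    #pragma omp taskwait
    std::uint64_t total = 0;
    for (std::uint64_t s : sub) total += s;
    return total;
}

// Count all partitions of an n-element set.
std::uint64_t count_partitions_recursive(int n) {
    assert(n >= 0);
    const std::vector<int> a(n, 0);
    std::uint64_t total = 0;
    #pragma omp parallel
    {
        #pragma omp single
        total = extend(a, 0, 0);            // one thread starts; tasks spread the work
    }
    return total;
}
```

Calling `count_partitions_recursive(6)` explores the same tree that the answer describes: the first element always lands in subset 0, and every later element either joins an existing subset or opens a new one. Without OpenMP enabled the pragmas are ignored and the recursion runs serially.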


玩套路吗 2025-01-09 06:49:05

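Here is the C++ implementation I obtained using swen's suggestion. The number of threads depends on the value of r. For r=6 the number of prefix partitions is the sixth Bell number, which is 203. For r=0 we just get a normal non-parallel program.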

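#include <algorithm>
#include <cstdio>

const int MAX = 10010; // upper bound on the number of prefix partitions
const int MX = 100;    // upper bound on N
int N, r = 6;          // N elements in total; prefixes cover the first r

int F[MAX];      // current partition of the first r elements
int Fa[MAX][MX]; // Fa[ID]: complete partition being enumerated for prefix ID
int P[MAX];      // first appearance of each subset label in F
int Pa[MAX][MX]; // first appearance of each subset label in Fa[ID]

int next() { // advances F to the next partition of the first r elements
    for (int i = r - 1; i >= 0; i--) {
        P[F[i]] = i;
    }
    for (int i = r - 1; i >= 0; i--) {
        if (P[F[i]] != i) { // element i does not open its subset, so its label can grow
            F[i]++;
            for (int j = i + 1; j < r; j++) {
                F[j] = 0;
            }
            return 1;
        }
    }
    return 0;
}

int sig(int ID) { // advances Fa[ID] to the next partition with the same r-element prefix
    for (int i = N - 1; i >= 0; i--) {
        Pa[ID][Fa[ID][i]] = i;
    }
    for (int i = N - 1; i >= r; i--) { // never touches the first r elements
        if (Pa[ID][Fa[ID][i]] != i) {
            Fa[ID][i]++;
            for (int j = i + 1; j < N; j++) {
                Fa[ID][j] = 0;
            }
            return 1;
        }
    }
    return 0;
}

int main() {
    // N is read into the global so that sig() sees it; a local N would shadow it.
    if (scanf("%d", &N) != 1 || N < 0 || N > MX) return 1;
    r = std::min(r, N); // a prefix cannot be longer than the whole set
    int t = 1, partitions = 0;
    while (t || next()) { // save the current prefix so a thread can extend it later
        t = 0;
        for (int i = 0; i < r; i++) {
            Fa[partitions][i] = F[i];
        }
        partitions++;
    }
    long long total = 0;
    // One thread per prefix is requested; the loop stays correct if fewer are granted.
    #pragma omp parallel for num_threads(partitions) schedule(static, 1) reduction(+:total)
    for (int ID = 0; ID < partitions; ID++) {
        int t = 1;
        while (t || sig(ID)) { // iterate through each partition that starts with prefix ID
            t = 0; // the first pass visits the initial partition; later passes rely on sig()
            // the current partition is found in Fa[ID][0..N-1]
            total++;
        }
    }
    printf("%lld\n", total); // number of partitions visited
}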
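Since the number of prefixes, and with it the number of threads requested, is the Bell number of r, a small helper can pick r for a given machine. The following is a sketch that computes Bell numbers with the Bell triangle; the function names `bell` and `min_prefix_length` are hypothetical, and the limit r <= 24 is there only so that every intermediate value fits in 64 bits.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Bell number B(r): the number of partitions of an r-element set.
std::uint64_t bell(int r) {
    assert(0 <= r && r <= 24);
    std::vector<std::uint64_t> row{1};             // row 0 of the Bell triangle
    for (int i = 0; i < r; ++i) {
        // A new row starts with the last entry of the previous row; each
        // further entry adds the entry to its left and the one above that.
        std::vector<std::uint64_t> next{row.back()};
        for (std::uint64_t x : row) next.push_back(next.back() + x);
        row = std::move(next);
    }
    return row.front();
}

// Smallest prefix length r that yields at least `workers` prefixes.
int min_prefix_length(std::uint64_t workers) {
    int r = 0;
    while (r < 24 && bell(r) < workers) ++r;
    return r;
}
```

For example, `bell(6)` is 203, matching the thread count quoted above, and `min_prefix_length(8)` is 4 because B(3) = 5 is too few for eight workers while B(4) = 15 is enough.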
