OpenMP: do I have false sharing or a race condition?

Posted on 2025-01-29 16:57:37

I'm learning OpenMP and working on compressed sparse row (CSR) matrix multiplication with the element type std::complex<int>. I get a different execution time each time I run the following function:

#include <complex>
#include <stdexcept>
#include <vector>

typedef std::vector<std::vector<std::complex<int>>> matrix;

struct CSR {
    std::vector<std::complex<int>> values; // non-zero values
    std::vector<int> row_ptr;              // offset in values where each row starts
    std::vector<int> cols_index;           // column index of each non-zero value
    int rows;                              // number of rows
    int cols;                              // number of columns
    int NNZ;                               // number of non-zero elements
};

// sparse_transpose() is defined elsewhere (not shown in this post).
matrix multiply_omp(const CSR& A, const CSR& B) {
    if (A.cols != B.rows)
        throw std::invalid_argument("multiply_omp: A.cols must equal B.rows");
    CSR B_t = sparse_transpose(B);
    // Columns are indexed by k < B_t.rows below, so each result row is
    // sized to B_t.rows (the column count of B) rather than B.rows.
    matrix result(A.rows, std::vector<std::complex<int>>(B_t.rows, 0));
    #pragma omp parallel
    {
        // Each iteration i writes only to result[i].
        #pragma omp for
        for (int i = 0; i < A.rows; i++) {
            for (int j = A.row_ptr[i]; j < A.row_ptr[i + 1]; j++) {
                int Ai = A.cols_index[j];
                std::complex<int> Avalue = A.values[j];
                for (int k = 0; k < B_t.rows; k++) {
                    std::complex<int> sum(0, 0);
                    // Look for the entry of row k of B_t whose column matches Ai.
                    for (int l = B_t.row_ptr[k]; l < B_t.row_ptr[k + 1]; l++)
                        if (Ai == B_t.cols_index[l]) {
                            sum += Avalue * B_t.values[l];
                            break;
                        }
                    if (sum != std::complex<int>(0, 0)) {
                        result[i][k] += sum;
                    }
                }
            }
        }
    }

    return result;
}
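
To make the CSR layout used above concrete, here is a minimal sketch that builds the same struct from a dense matrix. The helper name dense_to_csr is hypothetical and not part of the original post, and it assumes every row of the dense input has the same length.

```cpp
#include <cassert>
#include <complex>
#include <vector>

typedef std::vector<std::vector<std::complex<int>>> matrix;

struct CSR {
    std::vector<std::complex<int>> values; // non-zero values
    std::vector<int> row_ptr;              // offset in values where each row starts
    std::vector<int> cols_index;           // column index of each non-zero value
    int rows;                              // number of rows
    int cols;                              // number of columns
    int NNZ;                               // number of non-zero elements
};

// Hypothetical helper (an assumption, not from the post): dense -> CSR.
CSR dense_to_csr(const matrix& dense) {
    CSR out;
    out.rows = static_cast<int>(dense.size());
    out.cols = dense.empty() ? 0 : static_cast<int>(dense[0].size());
    out.row_ptr.push_back(0); // row 0 starts at offset 0
    for (const auto& row : dense) {
        // Assumption: the input is rectangular.
        assert(static_cast<int>(row.size()) == out.cols);
        for (int c = 0; c < out.cols; ++c) {
            if (row[c] != std::complex<int>(0, 0)) {
                out.values.push_back(row[c]);
                out.cols_index.push_back(c);
            }
        }
        // The end of this row is the start of the next one.
        out.row_ptr.push_back(static_cast<int>(out.values.size()));
    }
    out.NNZ = static_cast<int>(out.values.size());
    return out;
}
```

With this layout, row i occupies the half-open range [row_ptr[i], row_ptr[i + 1]) of values and cols_index, which is the range the j and l loops in multiply_omp walk over.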

I call the function 10 times in a for loop on 1000*1000 matrices and time each call with omp_get_wtime(). Here are the results:

iteration 1 : 0.751642 s
iteration 2 : 0.911264 s
iteration 3 : 1.553695 s
iteration 4 : 0.761839 s
iteration 5 : 0.603688 s
iteration 6 : 0.423919 s
iteration 7 : 0.423114 s
iteration 8 : 0.445878 s
iteration 9 : 0.892305 s
iteration 10 : 0.918682 s
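
The measurement described above could be sketched roughly as follows. The names wall_seconds, time_runs, summarize and TimingSummary are hypothetical and not from the original post. The sketch uses omp_get_wtime() when compiled with OpenMP enabled and falls back to std::chrono otherwise. The warm-up call is an added assumption: it is a common precaution so that one-time costs, such as creating the thread pool, are not counted in the first measurement.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Wall-clock time in seconds.
inline double wall_seconds() {
#ifdef _OPENMP
    return omp_get_wtime();
#else
    using clock = std::chrono::steady_clock;
    return std::chrono::duration<double>(clock::now().time_since_epoch()).count();
#endif
}

struct TimingSummary {
    double min;
    double median;
    double max;
};

// Calls work() `warmup` times untimed, then `runs` times timed.
template <typename F>
std::vector<double> time_runs(F&& work, int runs, int warmup = 1) {
    for (int i = 0; i < warmup; ++i) work();
    std::vector<double> seconds;
    seconds.reserve(runs);
    for (int i = 0; i < runs; ++i) {
        const double start = wall_seconds();
        work();
        seconds.push_back(wall_seconds() - start);
    }
    return seconds;
}

// Min, median and max of a non-empty list of timings.
TimingSummary summarize(std::vector<double> t) {
    assert(!t.empty());
    std::sort(t.begin(), t.end());
    const auto n = t.size();
    const double median = (n % 2 == 1) ? t[n / 2] : 0.5 * (t[n / 2 - 1] + t[n / 2]);
    return {t.front(), median, t.back()};
}
```

A call such as summarize(time_runs([&] { multiply_omp(A, B); }, 10)) would then report the spread directly. For the ten timings listed above, the minimum is 0.423114 s, the median is about 0.757 s, and the maximum is 1.553695 s.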

Is that normal? Or do I have false sharing or a race condition?
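
One way to probe the race-condition half of this question is to check whether repeated runs produce the same output as a trusted reference, for example a single-threaded run. The sketch below is only an illustration: the helper name matches_reference is hypothetical. A mismatch would point to a correctness problem such as a data race, while identical results do not prove that none exists. It says nothing about false sharing, which affects speed rather than results.

```cpp
#include <cassert>
#include <complex>
#include <vector>

typedef std::vector<std::vector<std::complex<int>>> matrix;

// Hypothetical helper (an assumption, not from the post): calls compute()
// `runs` times and reports whether every result equals `reference`.
template <typename F>
bool matches_reference(F&& compute, const matrix& reference, int runs) {
    assert(runs > 0); // at least one comparison is needed for a meaningful answer
    for (int r = 0; r < runs; ++r) {
        if (compute() != reference) {
            return false; // some run disagreed with the reference
        }
    }
    return true;
}
```

For the function in this post, compute could be a lambda that returns multiply_omp(A, B), with the reference taken from one run made after calling omp_set_num_threads(1).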
