How to handle floating point correctly when parallelizing with OpenMP

Posted on 2025-02-09 05:05:26

After hours of debugging and small-scale examples in an attempt to find what I was doing wrong, I finally got one of my theories confirmed: floating-point rounding error. Floating-point calculations are non-associative on computers.
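
To make the non-associativity concrete, here is a minimal C++ sketch. It is not part of the original post, and the function names sum_left and sum_right are made up for illustration. It assumes IEEE-754 double arithmetic and a compiler that does not reorder floating-point operations (for example, no -ffast-math).

```cpp
#include <cassert>

// Adds three doubles grouped as (a + b) + c.
double sum_left(double a, double b, double c) {
    return (a + b) + c;
}

// Adds the same three doubles grouped as a + (b + c).
double sum_right(double a, double b, double c) {
    return a + (b + c);
}

// Illustrative call (hypothetical values):
//   sum_left(1e20, -1e20, 1.0)  evaluates (1e20 - 1e20) + 1.0, which is 1.0
//   sum_right(1e20, -1e20, 1.0) evaluates 1e20 + (-1e20 + 1.0); the inner sum
//   rounds back to -1e20 because 1.0 is far below the spacing between doubles
//   near 1e20, so the result is 0.0
```

A parallel reduction gives each thread its own partial sum and combines them in an unspecified order, so it effectively changes the grouping in the same way as the two functions above.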

What can be done to reduce the errors?

Code:

double energy = 0;
double contrast = 0;
double homogeneity = 0;
double entropy = 0;
double correlation = 0;
double shade = 0;
double prominence = 0;
double glcmMean = 0;
double sigma = 0;
double squaredVarianceIntensity = 0;
double A = 0;
double B = 0;

for(int c = 0; c <normalizedGlcm.cols; c++){
    #pragma omp parallel for reduction(+:homogeneity,energy,contrast,entropy,glcmMean)
    for(int r = 0; r<normalizedGlcm.rows; r++){

        double pij = normalizedGlcm.at<double>(r,c,0);
        double intensity = (double)img.at<uchar>(col,row,0);

        if(pij != 0){
            homogeneity += pij/(1.0+((c-r)*(c-r)));
            energy += pij * pij;
            contrast += (c-r)*(c-r)*pij;
            entropy += -(log(pij)*pij); // pij will never be under 0
            glcmMean += pij * intensity;
        }
    }
}

After that bit there are more loops and some other calculations with the glcmMean variable. So far I only get errors with the glcmMean variable.
Error examples:

serial      vs parallel
1.66905e+28 vs 1.55964e+30
4.09033e+28 vs 3.62704e+30
8.38877e+30 vs 3.35551e+31

執念 2025-02-16 05:05:26

Based on the information in the comments, you can simply try the following code, which accumulates glcmMean per column so that the values being added are closer to the same order of magnitude than in the initial code. This assumes normalizedGlcm.cols and normalizedGlcm.rows are relatively close (e.g., not 2 and 2000).

double energy = 0;
double contrast = 0;
double homogeneity = 0;
double entropy = 0;
double correlation = 0;
double shade = 0;
double prominence = 0;
double glcmMean = 0;
double sigma = 0;
double squaredVarianceIntensity = 0;
double A = 0;
double B = 0;

for(int c = 0; c <normalizedGlcm.cols; c++){
    double local_homogeneity = 0;
    double local_energy = 0;
    double local_contrast = 0;
    double local_entropy = 0;
    double local_glcmMean = 0;

    #pragma omp parallel for reduction(+:local_homogeneity,local_energy,local_contrast,local_entropy,local_glcmMean)
    for(int r = 0; r<normalizedGlcm.rows; r++){
        double pij = normalizedGlcm.at<double>(r,c,0);
        double intensity = (double)img.at<uchar>(col,row,0);

        if(pij != 0){
            local_homogeneity += pij/(1.0+((c-r)*(c-r)));
            local_energy += pij * pij;
            local_contrast += (c-r)*(c-r)*pij;
            local_entropy += -(log(pij)*pij); // pij will never be under 0
            local_glcmMean += pij * intensity;
        }
    }

    homogeneity += local_homogeneity;
    energy += local_energy;
    contrast += local_contrast;
    entropy += local_entropy;
    glcmMean += local_glcmMean;
}

If the problem is due to floating-point saturation or imprecision, this should strongly improve the accuracy of the result, especially in the sequential version.
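
As a complementary option that the answer above does not use, compensated (Kahan) summation is a common way to reduce rounding error when small terms are added to a large running total. The sketch below is serial C++ only; the names naive_sum and kahan_sum are hypothetical, and it assumes IEEE-754 doubles and a build without -ffast-math, since such flags may optimize the compensation away.

```cpp
#include <cassert>
#include <vector>

// Plain left-to-right accumulation, as in the loops above.
double naive_sum(const std::vector<double>& values) {
    double sum = 0.0;
    for (double x : values) {
        sum += x;
    }
    return sum;
}

// Kahan (compensated) summation: keeps a running correction term that
// captures the low-order bits lost when a small value is added to a large sum.
double kahan_sum(const std::vector<double>& values) {
    double sum = 0.0;
    double compensation = 0.0;              // rounding error carried forward
    for (double x : values) {
        double y = x - compensation;        // apply the previous correction
        double t = sum + y;                 // low-order bits of y may be lost here
        compensation = (t - sum) - y;       // recover what was just lost
        sum = t;
    }
    return sum;
}
```

For example, with a vector holding 1e16 followed by one thousand values of 1.0, naive_sum returns 1e16 because each 1.0 is rounded away, while kahan_sum returns 1e16 + 1000. Whether this helps in the GLCM loops depends on whether rounding is really the cause of the mismatch.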
