Testing combinations of multiple arrays using CUDA

Posted 2025-01-02 06:31:01 · 1126 characters · 0 views · 0 comments

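I have the code below, written in PHP, and have been reading up on CUDA to make use of the GPU in my old GeForce 8800 Ultra. How do I convert this nested combination test to parallel CUDA code (if that is even possible)? The total number of combinations of the 2D arrays $a, $b, $c, $d, $e quickly rises into the trillions.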

// Each row is array(addValue, capValue, id).
// $topCombinations, $ca and CAP_LIMIT are defined elsewhere.
foreach($a as $aVal){
    foreach($b as $bVal){
        foreach($c as $cVal){
            foreach($d as $dVal){
                foreach($e as $eVal){

                    $addSum = $aVal[0]+$bVal[0]+$cVal[0]+$dVal[0]+$eVal[0];
                    $capSum = $aVal[1]+$bVal[1]+$cVal[1]+$dVal[1]+$eVal[1];
                    if($capSum <= CAP_LIMIT){
                        // Record the id of the row picked from each array.
                        $tempArr = array("a" => $aVal[2],"b" => $bVal[2],"c" => $cVal[2],
                        "d" => $dVal[2],"e" => $eVal[2],"addTotal" => $addSum,"capTotal" => $capSum);

                        array_push($topCombinations, $tempArr);

                        // Once the list passes 1000 entries, sort it and keep the first 900.
                        if(count($topCombinations) > 1000){
                           $topCombinations = $ca->arraySortedDescend($topCombinations);
                           array_splice($topCombinations, 900);
                        }
                    }
                }
            }
        }
    }
}

Comments (1)

可遇━不可求 2025-01-09 06:31:02

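This is a very open-ended question. It requires both porting the code to another language and designing a parallel algorithm. I won't go into too much detail, but in a nutshell:

How you parallelize it depends on the size of your arrays ($a to $e). If they are large enough, you could parallelize only the outer one or two loops across the threads of a grid and run the inner loops sequentially within each thread. If they are not very large, you might want to flatten two or three of the outer loops into a single index, or map them onto 2D or 3D thread blocks and grids in CUDA.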
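
To make the first option concrete, here is a minimal sketch, not a full port of the PHP. It flattens the two outer loops (over $a and $b) into one index spread across threads, and runs the three inner loops sequentially inside each thread. The names `Item`, `Best`, `searchKernel`, `toDevice` and `runSearch` are hypothetical, the sample data is made up, and error checking is omitted. As a simplification, each thread keeps only its single best combination (highest add total within the cap limit) instead of a top-900 list, and the host merges the per-thread results.

```cuda
#include <climits>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

struct Item { int add, cap, id; };               // mirrors $xVal[0], $xVal[1], $xVal[2]
struct Best { int addTotal, capTotal, ids[5]; }; // best combination seen by one thread

__global__ void searchKernel(const Item *a, int na, const Item *b, int nb,
                             const Item *c, int nc, const Item *d, int nd,
                             const Item *e, int ne, int capLimit, Best *out) {
    long long tid    = blockIdx.x * (long long)blockDim.x + threadIdx.x;
    long long stride = (long long)gridDim.x * blockDim.x;
    long long total  = (long long)na * nb;       // size of the flattened (a, b) space
    Best best = {INT_MIN, 0, {-1, -1, -1, -1, -1}};

    // Grid-stride loop: each thread handles several (a, b) pairs.
    for (long long flat = tid; flat < total; flat += stride) {
        int i = (int)(flat / nb), j = (int)(flat % nb);   // recover the loop indices
        int add2 = a[i].add + b[j].add;
        int cap2 = a[i].cap + b[j].cap;
        // The three inner loops run sequentially inside the thread.
        for (int k = 0; k < nc; ++k)
            for (int l = 0; l < nd; ++l)
                for (int m = 0; m < ne; ++m) {
                    int cap = cap2 + c[k].cap + d[l].cap + e[m].cap;
                    int add = add2 + c[k].add + d[l].add + e[m].add;
                    if (cap <= capLimit && add > best.addTotal) {
                        best.addTotal = add;
                        best.capTotal = cap;
                        best.ids[0] = a[i].id; best.ids[1] = b[j].id;
                        best.ids[2] = c[k].id; best.ids[3] = d[l].id;
                        best.ids[4] = e[m].id;
                    }
                }
    }
    out[tid] = best;   // threads that found nothing keep addTotal == INT_MIN
}

// Copy a host vector to freshly allocated device memory.
template <typename T>
static T *toDevice(const std::vector<T> &v) {
    T *p = nullptr;
    cudaMalloc(&p, v.size() * sizeof(T));
    cudaMemcpy(p, v.data(), v.size() * sizeof(T), cudaMemcpyHostToDevice);
    return p;
}

Best runSearch(const std::vector<Item> &a, const std::vector<Item> &b,
               const std::vector<Item> &c, const std::vector<Item> &d,
               const std::vector<Item> &e, int capLimit) {
    const int blocks = 2, threads = 32;   // tiny launch for toy data; tune for real inputs
    Item *da = toDevice(a), *db = toDevice(b), *dc = toDevice(c);
    Item *dd = toDevice(d), *de = toDevice(e);
    Best *dOut = nullptr;
    cudaMalloc(&dOut, blocks * threads * sizeof(Best));

    searchKernel<<<blocks, threads>>>(da, (int)a.size(), db, (int)b.size(),
                                      dc, (int)c.size(), dd, (int)d.size(),
                                      de, (int)e.size(), capLimit, dOut);

    std::vector<Best> host(blocks * threads);
    cudaMemcpy(host.data(), dOut, host.size() * sizeof(Best), cudaMemcpyDeviceToHost);
    cudaFree(da); cudaFree(db); cudaFree(dc); cudaFree(dd); cudaFree(de); cudaFree(dOut);

    // Merge the per-thread winners on the host.
    Best best = host[0];
    for (const Best &h : host)
        if (h.addTotal > best.addTotal) best = h;
    return best;
}

int main() {
    // Made-up sample rows: {add, cap, id}.
    std::vector<Item> a = {{5, 2, 0}, {9, 6, 1}},   b = {{4, 1, 10}, {7, 3, 11}};
    std::vector<Item> c = {{3, 1, 20}, {8, 4, 21}}, d = {{2, 1, 30}, {6, 5, 31}};
    std::vector<Item> e = {{1, 1, 40}, {10, 7, 41}};
    Best r = runSearch(a, b, c, d, e, /*capLimit=*/15);
    std::printf("addTotal=%d capTotal=%d ids=%d,%d,%d,%d,%d\n", r.addTotal, r.capTotal,
                r.ids[0], r.ids[1], r.ids[2], r.ids[3], r.ids[4]);
    return 0;
}
```

Keeping a real top-N list would need a small per-thread buffer or a later sort-and-trim pass on the collected results, much as the PHP does with its sort and splice. If the $a and $b arrays are too small to keep the GPU busy, a third loop can be folded into the flat index in the same way.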
