Filling the accumulator for the Hough transform

Posted on 2024-10-03 22:28:05

I wrote a piece of code that needs to be optimized, and I just wanted to check with the community whether it is indeed optimal. It fills the accumulator for the Hough transform. I actually copy-pasted most of the code from the OpenCV library. Thanks!


int i, j, n, index;
for (i = 0; i < numrows; i++)
{
    for (j = 0; j < numcols; j++)
    {
        if (img[i*numcols + j] == 100)      /* edge pixels are marked with the value 100 */
        {
            for (n = 300; n < 600; n++)     /* angle bins, using precomputed sin/cos tables */
            {
                index = cvRound(j*tabCos[n] + i*tabSin[n]) + (numrho - 1)/2;
                accum[(n+1) * (numrho+2) + index + 1]++;   /* accumulator has a 1-cell border, as in OpenCV */
            }
        }
    }
}



Comments (3)

断舍离 2024-10-10 22:28:05

There is a large and repetitive Hough transform in a piece of code I'm vaguely attached to. The maintainer of that part of the code has been experimenting with sparse arrays (actually a C++ std::map keyed on the cell index, if I understood his presentation right) for the accumulator, with some success.

I presume the speed-up is related to cache-locality issues, and it certainly depends on the data being sparse.
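A minimal sketch of that sparse-accumulator idea, assuming the cells are keyed on a flattened (angle bin, rho bin) index (the names here are invented for illustration, not taken from that project):

#include <cstdint>
#include <map>

// Sketch of a sparse Hough accumulator: cells that never receive a vote are
// simply absent from the map, which pays off when the edge data is sparse.
using SparseAccum = std::map<std::int64_t, int>;

inline void vote(SparseAccum& accum, int angleBin, int rhoBin, int numRho)
{
    // Same flattening as a dense accumulator: one row per angle bin.
    const std::int64_t cell = static_cast<std::int64_t>(angleBin) * numRho + rhoBin;
    ++accum[cell];   // operator[] inserts a zero-initialised cell on first use
}

A std::unordered_map would usually make the voting itself cheaper; std::map keeps the cells ordered by index, which can be convenient when scanning for peaks afterwards.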


Update: The software referenced above is intended to serve many particle-physics experiments, but was originally used on a test-bed project (i.e. small scale). As we've gotten serious about doing larger projects and started doing Monte Carlo for them, the Hough transform has become a bit of a bottleneck again, even with the sparse matrix.

As yet we do not have a solution, but a colleague found Gandalf, which includes a "fast Hough transform" that appears to evaluate the transform in a quad-tree-like way (in 2D; presumably you would use an oct-tree in 3D) to reduce the order of work. We will probably experiment with this.

Further update: A colleague eventually implemented a progressive probabilistic Hough transform in our code, which currently seems to be the fastest version we have. It works best if you don't require that every point gets assigned to a line.
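If you are already on OpenCV, its HoughLinesP function is an implementation of the progressive probabilistic Hough transform, so a quick way to try the idea is roughly the following (the threshold and length values are arbitrary placeholders, not tuned recommendations):

#include <opencv2/imgproc.hpp>
#include <vector>

// Minimal usage sketch: run OpenCV's progressive probabilistic Hough
// transform on a binary edge image and get back line segments.
std::vector<cv::Vec4i> detectSegments(const cv::Mat& edges)
{
    std::vector<cv::Vec4i> segments;        // each entry: x1, y1, x2, y2
    cv::HoughLinesP(edges, segments,
                    1.0,                    // rho resolution (pixels)
                    CV_PI / 180.0,          // theta resolution (radians)
                    80,                     // accumulator vote threshold
                    30,                     // minimum segment length
                    10);                    // maximum gap to bridge
    return segments;
}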


新雨望断虹 2024-10-10 22:28:05

No, it's not. Replace as many of the [] usages as you can with simple pointer arithmetic to iterate over the arrays in question. Abstract invariant expressions out into local variables.

However, the first question is: does your profiler show that this code is a bottleneck in the context of your entire app? If not, why bother micro-optimizing it?

EDIT: loop micro-optimization - prefer the second, since no array indexing is required (a multiply versus an add):

#include <stdio.h>

int main(void)
{
  int ints[100] = {0};
  int i;
  int *pi;

  /* index-based loop: the address ints + i is recomputed each iteration */
  for (i = 0; i < 100; ++i)
  {
    printf("%d", ints[i]);
  }

  /* pointer-based loop: only the pointer itself is incremented */
  for (pi = ints; pi < ints + 100; ++pi)
  {
    printf("%d", *pi);
  }

  return 0;
}
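Applied to the accumulator loop from the question, those two suggestions would look roughly like this sketch (the parameter types are assumptions; cvRound comes from OpenCV's core header, as in the question):

#include <opencv2/core.hpp>   // for cvRound, as used in the question

// Sketch only: the question's loop with the invariant (numrho - 1)/2 hoisted
// out and the image walked with a pointer instead of img[i*numcols + j].
// Assumes an 8-bit, row-major image and float sin/cos tables.
void fillAccum(const unsigned char* img, int numrows, int numcols,
               const float* tabSin, const float* tabCos,
               int numrho, int* accum)
{
    const int halfRho = (numrho - 1) / 2;    // invariant across every iteration
    const unsigned char* p = img;            // walks the image row by row

    for (int i = 0; i < numrows; i++)
    {
        for (int j = 0; j < numcols; j++, p++)
        {
            if (*p != 100)                   // same edge test as the original
                continue;
            for (int n = 300; n < 600; n++)
            {
                // (n + 1) * (numrho + 2) could likewise be precomputed per n
                int index = cvRound(j * tabCos[n] + i * tabSin[n]) + halfRho;
                accum[(n + 1) * (numrho + 2) + index + 1]++;
            }
        }
    }
}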

烟织青萝梦 2024-10-10 22:28:05

Depending on your application, there are numerous ways to optimise the Hough transform, and fiddling with low-level code is possibly the last of them.
I would start with the Randomised HT or the Multiresolution HT, followed by a hybrid approach that merges the two.
I believe it is better to optimise the algorithm first. The last step would be to use hardware optimisation such as CAD memory.
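As a rough illustration of the Randomised HT idea (the names and bin sizes below are invented for the sketch, not from the answer): pick two edge points at random, compute the single (theta, rho) line they define, and vote for just that one cell instead of a whole row of the accumulator:

#include <cmath>
#include <cstdlib>
#include <map>
#include <utility>
#include <vector>

struct Pt { int x, y; };

// Sketch of a Randomised Hough Transform for lines: sample pairs of edge
// points, derive the one (theta, rho) they determine, and vote for that
// single cell.  Bin sizes and the iteration count are placeholders.
std::map<std::pair<int, int>, int>
randomizedHough(const std::vector<Pt>& edges, int iterations)
{
    const double kPi = 3.14159265358979323846;
    const double thetaStep = kPi / 180.0;        // 1-degree bins
    const double rhoStep = 1.0;                  // 1-pixel bins
    std::map<std::pair<int, int>, int> votes;    // (thetaBin, rhoBin) -> count

    for (int it = 0; it < iterations && edges.size() >= 2; ++it)
    {
        const Pt& a = edges[std::rand() % edges.size()];
        const Pt& b = edges[std::rand() % edges.size()];
        if (a.x == b.x && a.y == b.y)
            continue;                            // need two distinct points

        // Normal form of the line through a and b: x*cos(theta) + y*sin(theta) = rho
        double theta = std::atan2(double(b.x - a.x), double(a.y - b.y));
        if (theta < 0) theta += kPi;             // keep theta in [0, pi)
        double rho = a.x * std::cos(theta) + a.y * std::sin(theta);  // may be negative

        ++votes[{ int(theta / thetaStep), int(std::floor(rho / rhoStep)) }];
    }
    return votes;                                // peaks in 'votes' suggest lines
}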

