How does the Gaussian filter algorithm in OpenCV work?
I wrote my own Gaussian filter, but it is really slow.
OpenCV's Gaussian algorithm is much faster, about 20 times faster than my Gaussian filter.
I want to reimplement OpenCV's Gaussian algorithm in my project, and I don't want to include OpenCV in my project.
However, OpenCV's source code seems too hard to understand.
Can anyone give me a description of the algorithm?
Comments (5)
The Gaussian filter has a property that makes it very easy to speed up: it is separable, so it can be applied in each dimension independently. You define a one-dimensional filter that operates vertically and another that operates horizontally, and apply them one after the other; this produces the same result as a single two-dimensional filter.
Beyond that, you'll probably want to look at the SIMD instructions available for your processor, e.g. SSE3.
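As a rough illustration of the two-pass idea, here is a minimal pure-Python sketch. The function names, the list-of-lists image layout, and the choice to replicate edge pixels at the borders are all assumptions made for this example; it is not OpenCV's actual code, and a real implementation would do this in C or C++ with SIMD.

```python
import math

def gaussian_kernel_1d(sigma, radius):
    """Return a normalized 1-D Gaussian kernel of length 2*radius + 1."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def convolve_rows(image, kernel):
    """Filter every row with the 1-D kernel (border handling is an assumption)."""
    radius = len(kernel) // 2
    width = len(image[0])
    out = []
    for row in image:
        new_row = []
        for x in range(width):
            acc = 0.0
            for k, w in enumerate(kernel):
                # Clamp the index so edge pixels are replicated past the border.
                xx = min(max(x + k - radius, 0), width - 1)
                acc += w * row[xx]
            new_row.append(acc)
        out.append(new_row)
    return out

def transpose(image):
    """Swap rows and columns of a list-of-lists image."""
    return [list(col) for col in zip(*image)]

def separable_gaussian_blur(image, sigma, radius):
    """Blur with a horizontal 1-D pass followed by a vertical 1-D pass."""
    kernel = gaussian_kernel_1d(sigma, radius)
    horizontal = convolve_rows(image, kernel)
    # The vertical pass is the same row filter applied to the transposed image.
    return transpose(convolve_rows(transpose(horizontal), kernel))
```

Blurring an image that contains a single bright pixel with this function gives the outer product of the 1-D kernel with itself, which is exactly the 2-D Gaussian kernel.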
To answer the second part of your question, a Gaussian blur is simply a 2-D Gaussian function (a bell-shaped surface when plotted in 3-D) applied as a convolution kernel over the image. Wikipedia has a great reference on the algorithm itself, but basically, you sample the values of a Gaussian curve into a square matrix, and at every single pixel in your image you replace the pixel with the sum of its neighborhood weighted by that matrix, e.g.:
(Note that this is just a sample kernel; the exact values come from the Gaussian equation, so different Gaussian parameters give different kernels.)
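For reference, one way such a kernel could be computed is to sample exp(-(x^2 + y^2) / (2 * sigma^2)) on a square grid and normalize so the weights sum to 1. The sketch below assumes that formula; the function name and the `sigma` and `radius` parameters are hypothetical and chosen only for illustration.

```python
import math

def gaussian_kernel_2d(sigma, radius):
    """Sample a 2-D Gaussian on a (2*radius+1) square grid and normalize it."""
    kernel = [[math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
               for x in range(-radius, radius + 1)]
              for y in range(-radius, radius + 1)]
    total = sum(sum(row) for row in kernel)
    # Dividing by the total keeps the overall image brightness unchanged.
    return [[value / total for value in row] for row in kernel]

# Print a 3x3 kernel for sigma = 1 as a quick look at the weights.
for row in gaussian_kernel_2d(1.0, 1):
    print(" ".join(f"{value:.4f}" for value in row))
```

A larger `sigma` flattens the weights and blurs more; the radius is usually chosen as a few multiples of `sigma` so the truncated tail is negligible.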
To answer the performance part of your question, the overall speed of this algorithm depends on a few things, assuming a constant-sized image. Let's say the image is NxM pixels, and the convolution kernel is PxP pixels. You're going to have to do P*P*N*M operations. The greater P, the more operations you're going to have to do for a given image. You can get crafty with the algorithm you use here, doing very specific row- or column-based math.
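To put rough numbers on that count, here is a tiny sketch comparing the direct PxP kernel with a row pass plus a column pass of P taps each. The function names are made up for this example, and the counts ignore border handling and memory traffic, so treat them as an order-of-magnitude estimate only.

```python
def multiplies_direct(n, m, p):
    """Multiplications for a full PxP kernel applied at every pixel of an NxM image."""
    return p * p * n * m

def multiplies_separable(n, m, p):
    """Multiplications for one P-tap row pass plus one P-tap column pass."""
    return 2 * p * n * m

n, m, p = 640, 480, 9
print("direct:   ", multiplies_direct(n, m, p))
print("separable:", multiplies_separable(n, m, p))
print("ratio:    ", multiplies_direct(n, m, p) / multiplies_separable(n, m, p))
```

The ratio works out to P/2, so the saving from the row-and-column approach grows linearly with the kernel size.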
Implementation is also very important. If you want to be extremely efficient, you'll probably want to use the most advanced instructions that your architecture offers. If you're using an Intel x86 chip, you'll probably want to look at getting a license for Intel Performance Primitives (IPP) and calling those routines directly. IIRC, OpenCV does make use of IPP when it's available...
You could also do something very smart and work entirely with scaled integers (fixed-point arithmetic) if the floating-point performance on your given architecture is poor. This would probably speed things up a bit, but I would look at other options before going down this road.
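A minimal sketch of the scaled-integer idea, assuming 8-bit pixel values and weights scaled by 2**8 = 256; the shift amount, the function names, and the edge-replicating border handling are all assumptions of this example rather than anything OpenCV is known to do.

```python
import math

SHIFT = 8  # weights are stored as integers scaled by 2**8 (an assumption of this sketch)

def integer_kernel_1d(sigma, radius, shift=SHIFT):
    """Return integer Gaussian weights that sum to exactly 2**shift."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    scale = 1 << shift
    ints = [int(round(w / total * scale)) for w in weights]
    # Fold the rounding error into the center tap so brightness is preserved.
    ints[radius] += scale - sum(ints)
    return ints

def blur_row_fixed_point(row, kernel, shift=SHIFT):
    """Filter one row of integer pixels using only integer multiply, add, and shift."""
    radius = len(kernel) // 2
    last = len(row) - 1
    half = 1 << (shift - 1)  # added before the shift so the result is rounded
    out = []
    for x in range(len(row)):
        acc = 0
        for k, w in enumerate(kernel):
            acc += w * row[min(max(x + k - radius, 0), last)]
        out.append((acc + half) >> shift)
    return out
```

Because the weights sum to exactly 256, a flat region keeps its value after the shift, and 8-bit inputs can never overflow the 0 to 255 range on output.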
Try checking here. You want to compute the discrete Gaussian matrix ahead of time, then convolve it with the image.
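A sketch of that approach, using the common 3x3 binomial approximation of a Gaussian as the precomputed matrix. The kernel choice, the function name, and the edge-replicating border handling are assumptions for illustration only.

```python
# Precomputed 3x3 binomial approximation of a Gaussian; the weights sum to 16.
KERNEL = [[1, 2, 1],
          [2, 4, 2],
          [1, 2, 1]]
KERNEL_SUM = 16

def convolve_2d(image, kernel, kernel_sum):
    """Apply a square kernel at every pixel, replicating edge pixels at the border.

    The kernel is symmetric, so correlation and convolution give the same result.
    """
    height, width = len(image), len(image[0])
    radius = len(kernel) // 2
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0
            for ky, kernel_row in enumerate(kernel):
                yy = min(max(y + ky - radius, 0), height - 1)
                for kx, weight in enumerate(kernel_row):
                    xx = min(max(x + kx - radius, 0), width - 1)
                    acc += weight * image[yy][xx]
            out[y][x] = acc / kernel_sum
    return out
```

This is the straightforward version whose cost grows with the square of the kernel width, so it mainly serves as a correctness baseline for faster variants.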
If your convolution kernel is relatively large and you are implementing direct convolution, the performance difference may be because OpenCV is implementing convolution using a fast Fourier transform (FFT).
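To show the principle behind FFT-based convolution, here is a 1-D pure-Python sketch: zero-pad both sequences, transform, multiply pointwise, and transform back. The radix-2 FFT and all function names are written just for this example, and whether OpenCV takes this path for a given Gaussian blur is not something this sketch establishes. A 2-D version would apply the same idea along rows and then columns.

```python
import cmath

def fft(values, inverse=False):
    """Recursive radix-2 FFT; len(values) must be a power of two (unnormalized)."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def fft_convolve(signal, kernel):
    """Full linear convolution of two real sequences via the FFT."""
    result_len = len(signal) + len(kernel) - 1
    size = 1
    while size < result_len:  # pad to a power of two to avoid wrap-around
        size *= 2
    a = fft(list(signal) + [0.0] * (size - len(signal)))
    b = fft(list(kernel) + [0.0] * (size - len(kernel)))
    product = [x * y for x, y in zip(a, b)]
    spatial = fft(product, inverse=True)
    # The inverse transform above is unnormalized, so divide by the size here.
    return [v.real / size for v in spatial[:result_len]]

def direct_convolve(signal, kernel):
    """Reference O(len(signal) * len(kernel)) convolution for comparison."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out
```

The FFT route costs roughly the same regardless of kernel size, which is why it only tends to pay off once the kernel is relatively large.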
I hate to be pedantic, but you are asking for an algorithm, that is, a precise sequence of steps needed to accomplish a task. You already have the Gaussian algorithm. So the key point of your question is that you are asking for something faster, which is not the same as asking for an algorithm.
To answer the "faster" question: you want to know how OpenCV optimizes its code, which is a highly technical and broad subject. I would hazard a guess that it uses assembly language and GPU-specific functions. I'd start by learning assembly and researching the CUDA package to take advantage of your GPU.