FFT on images to measure similarity
OK, I'm writing a small Java app that accepts two images as input, compares them, and then gives a quantitative output as a measure of similarity (e.g. 50% similar).
As I understand it, an FFT is a good way to measure the similarity of two images. But I can't for the love of god figure out how to code/implement it.
So far I've implemented another function which basically gives me two histograms (one for each image). All I need now is to write a method that will FFT an image and give me a quantifiable outcome.
Can anyone help me out with this? I'd really like to see some sample code, or at least a pointer in the right direction. Many thanks in advance.
Similarity is not an exact term. For example: if you have a circle and an ellipse, are they similar? They are both round objects, so in this sense they are - but if we want to filter out circles only, they are not. You will have to define a measure (or several measures - for example roundness, intensity distribution, size, orientation, number of objects, Euler number, etc.), then calculate it for each image. The similarity of the two images will be (some kind of) distance between the two calculated values. This could be the Euclidean distance (for two real-valued measures), or some kind of error function (RMS for intensity distributions).
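Since the question already has one histogram per image, here is a minimal sketch of the "RMS for intensity distributions" idea. The class and method names are made up for illustration, and it assumes both histograms use the same number of bins; it is one possible distance, not the definitive one.

```java
final class HistogramDistance {

    /** Scales a histogram so its bins sum to 1, so images of different sizes stay comparable. */
    static double[] normalize(int[] histogram) {
        long total = 0;
        for (int count : histogram) {
            total += count;
        }
        double[] shares = new double[histogram.length];
        if (total == 0) {
            return shares; // empty histogram: all shares stay 0
        }
        for (int i = 0; i < histogram.length; i++) {
            shares[i] = (double) histogram[i] / total;
        }
        return shares;
    }

    /** Root-mean-square difference between two vectors of equal length. */
    static double rms(double[] a, double[] b) {
        if (a.length != b.length || a.length == 0) {
            throw new IllegalArgumentException("vectors must be non-empty and of equal length");
        }
        double sumOfSquares = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sumOfSquares += d * d;
        }
        return Math.sqrt(sumOfSquares / a.length);
    }

    public static void main(String[] args) {
        int[] h1 = {10, 20, 30, 40};
        int[] h2 = {40, 30, 20, 10};
        System.out.println(rms(normalize(h1), normalize(h1))); // identical histograms give 0.0
        System.out.println(rms(normalize(h1), normalize(h2)));
    }
}
```

A distance of 0 means the histograms are identical; turning the distance into a "percent similar" figure still requires choosing a scale yourself.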
You will have to choose which transforms your measure should stay invariant under (is a rotated image similar to the original? If yes, a simple Fourier transform is not appropriate, since rotating an image rotates its spectrum as well).
Measuring the similarity of images is hard; if you have to do that, I would read about image stitching. If you just need to distinguish blobs, first try to calculate some simple measures (I recommend calculating moments such as area and orientation, and reading about K-means clustering), or the 1D Fourier transform of the distance of the contour from the center of mass (which is a little bit more difficult).
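To make the moments suggestion concrete, here is a rough sketch that computes area, centroid and orientation for a single blob. It assumes the blob has already been segmented into a boolean mask (mask[y][x] is true for blob pixels); the class name and the mask representation are illustrative only. The orientation uses the standard second-order central moment formula, 0.5 * atan2(2 * mu11, mu20 - mu02).

```java
final class BlobMoments {

    /** Returns {area, centroidX, centroidY, orientationInRadians} for the true pixels of the mask. */
    static double[] describe(boolean[][] mask) {
        double m00 = 0, m10 = 0, m01 = 0;
        for (int y = 0; y < mask.length; y++) {
            for (int x = 0; x < mask[y].length; x++) {
                if (mask[y][x]) {
                    m00 += 1; // zeroth moment: pixel count (area)
                    m10 += x; // first moments: used for the centroid
                    m01 += y;
                }
            }
        }
        if (m00 == 0) {
            return new double[] {0, Double.NaN, Double.NaN, Double.NaN};
        }
        double cx = m10 / m00;
        double cy = m01 / m00;

        // Second-order central moments describe how the blob spreads around its centroid.
        double mu20 = 0, mu02 = 0, mu11 = 0;
        for (int y = 0; y < mask.length; y++) {
            for (int x = 0; x < mask[y].length; x++) {
                if (mask[y][x]) {
                    double dx = x - cx;
                    double dy = y - cy;
                    mu20 += dx * dx;
                    mu02 += dy * dy;
                    mu11 += dx * dy;
                }
            }
        }
        double orientation = 0.5 * Math.atan2(2 * mu11, mu20 - mu02);
        return new double[] {m00, cx, cy, orientation};
    }

    public static void main(String[] args) {
        boolean[][] horizontalBar = {{true, true, true, true}};
        double[] d = describe(horizontalBar);
        System.out.println("area=" + d[0] + " centroid=(" + d[1] + ", " + d[2] + ") angle=" + d[3]);
    }
}
```

Two blobs can then be compared by the distance between their descriptor vectors. Area and orientation change under scaling and rotation, so which components to keep depends on the invariances you decided on.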
Before you attempt to code up a 2D Fourier transform (2DFT), you should fully understand the math behind it. flolo is correct that you can compute it by first doing a 1D FFT on the rows and columns and then combining the results, but I have no reason to believe the L_inf norm is the best way to convert them to a metric, since it completely skips the usual combining step to create the full 2DFT. Take a look at the very bottom of this page: http://fourier.eng.hmc.edu/e101/lectures/Image_Processing/node6.html
That said, there may be better ways to compare images that don't require comparing 2D arrays of information. For instance, PCA (Principal Component Analysis) will give you a 1D vector per image. It is essentially a matter of running SVD (Singular Value Decomposition) on your images after mean-centering them, though I'd take a look at the Wikipedia article on it first. You could then apply some L_p norm to the vectors directly to compare them, although in this case I would use something like sum(min(a_i/b_i, b_i/a_i))/length(a), where a and b are the 1D vectors you got from the transform.
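The ratio formula above is easy to get subtly wrong around zeros, so here is a small sketch of it. The zero handling is my own assumption, not part of the formula: equal entries (including 0 and 0) count as a perfect match of 1, and a zero paired with a non-zero value counts as 0. The score only lands in [0, 1] when corresponding entries have the same sign.

```java
final class RatioSimilarity {

    /** Computes sum(min(a_i/b_i, b_i/a_i)) / length(a) for two vectors of equal length. */
    static double similarity(double[] a, double[] b) {
        if (a.length != b.length || a.length == 0) {
            throw new IllegalArgumentException("vectors must be non-empty and of equal length");
        }
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            if (a[i] == b[i]) {
                sum += 1.0; // assumption: equal entries (also 0 and 0) are a perfect match
            } else if (a[i] != 0 && b[i] != 0) {
                sum += Math.min(a[i] / b[i], b[i] / a[i]);
            }
            // assumption: exactly one entry is zero, so this pair contributes 0
        }
        return sum / a.length;
    }

    public static void main(String[] args) {
        double[] a = {1, 2, 4};
        double[] b = {2, 2, 2};
        System.out.println(similarity(a, a)); // identical vectors give 1.0
        System.out.println(similarity(a, b));
    }
}
```

For vectors with positive entries, multiplying the result by 100 gives the kind of "percent similar" figure the question asks for.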
There are many good sites with code for an FFT on a 1-D array of values. You just apply this FFT row by row to your image, and afterwards you do an FFT column by column on the results.
Now you need to get a metric from the resulting transformed images; my suggestion would be to try the max-norm (L_inf). Because the FFT values are complex, take the magnitude of the difference: max_{x,y} |fft2d(imag1)[x,y] - fft2d(imag2)[x,y]|.
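Here is a minimal sketch of the row-then-column approach together with the max-norm. It assumes both images are grayscale, already converted to double[][] arrays of identical size, and that width and height are powers of two (the simple radix-2 FFT below needs that). All names are illustrative; in a real application a tested FFT library would likely be a better choice than this hand-rolled one.

```java
final class Fft2dDistance {

    /** In-place iterative radix-2 FFT; the length must be a power of two. */
    static void fft(double[] re, double[] im) {
        int n = re.length;
        if (n == 0 || (n & (n - 1)) != 0) {
            throw new IllegalArgumentException("length must be a power of two");
        }
        // Reorder the input by bit-reversed index.
        for (int i = 1, j = 0; i < n; i++) {
            int bit = n >> 1;
            for (; (j & bit) != 0; bit >>= 1) {
                j ^= bit;
            }
            j ^= bit;
            if (i < j) {
                double t = re[i]; re[i] = re[j]; re[j] = t;
                t = im[i]; im[i] = im[j]; im[j] = t;
            }
        }
        // Combine sub-transforms of growing length with butterfly operations.
        for (int len = 2; len <= n; len <<= 1) {
            double angle = -2 * Math.PI / len;
            for (int start = 0; start < n; start += len) {
                for (int k = 0; k < len / 2; k++) {
                    double wr = Math.cos(angle * k), wi = Math.sin(angle * k);
                    int p = start + k, q = p + len / 2;
                    double xr = re[q] * wr - im[q] * wi;
                    double xi = re[q] * wi + im[q] * wr;
                    re[q] = re[p] - xr; im[q] = im[p] - xi;
                    re[p] += xr; im[p] += xi;
                }
            }
        }
    }

    /** 2D FFT: 1D FFT of every row, then of every column of the result. Returns {real, imaginary}. */
    static double[][][] fft2d(double[][] image) {
        int h = image.length, w = image[0].length;
        double[][] re = new double[h][w], im = new double[h][w];
        for (int y = 0; y < h; y++) {
            re[y] = image[y].clone();
            fft(re[y], im[y]);
        }
        double[] colRe = new double[h], colIm = new double[h];
        for (int x = 0; x < w; x++) {
            for (int y = 0; y < h; y++) { colRe[y] = re[y][x]; colIm[y] = im[y][x]; }
            fft(colRe, colIm);
            for (int y = 0; y < h; y++) { re[y][x] = colRe[y]; im[y][x] = colIm[y]; }
        }
        return new double[][][] {re, im};
    }

    /** Max-norm of the spectrum difference: the largest |F1[x,y] - F2[x,y]| over all frequencies. */
    static double maxNormDistance(double[][] img1, double[][] img2) {
        if (img1.length != img2.length || img1[0].length != img2[0].length) {
            throw new IllegalArgumentException("images must have the same size");
        }
        double[][][] f1 = fft2d(img1), f2 = fft2d(img2);
        double max = 0;
        for (int y = 0; y < img1.length; y++) {
            for (int x = 0; x < img1[0].length; x++) {
                double dRe = f1[0][y][x] - f2[0][y][x];
                double dIm = f1[1][y][x] - f2[1][y][x];
                max = Math.max(max, Math.hypot(dRe, dIm));
            }
        }
        return max;
    }

    public static void main(String[] args) {
        double[][] singlePixel = {{1, 0}, {0, 0}};
        double[][] black = {{0, 0}, {0, 0}};
        System.out.println(maxNormDistance(singlePixel, black)); // 1.0: a single pixel has a flat spectrum
    }
}
```

The result is an unbounded distance (0 for identical images), not a percentage, so you would still need to pick a normalization to report something like "50% similar".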
If you just want to check whether one image is likely to be a quick edit of another, for something like DRM of stock photography, then check the percentages of a normalized color palette within probable regions. If they match within a THRESHOLD for a NUMBER_OF_TEST_COLORS in any one of a number of TEST_REGIONS within the image, then you have a "suspect"... you still need a human to check the suspects. But this is a quick and dirty way to find many of the image re-sizers, horiz/vert flippers, background color changers, file format changers, and other subtle variations... Of course, "normalizing the colors" to a quantized palette is an art unto itself. I would recommend quantizing images into the nearest "web safe" colors for practicality.
I'm a blue-collar garbage man in comparison to a mathematician, but garbage men are quite practical! I have had good success with this kind of approach in grouping similar images and in search-by-color applications.
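A rough sketch of that palette check, under my own reading of the description: every pixel is snapped to the nearest of the 216 web-safe colors (each channel rounded to a multiple of 51), and a region counts as a match when at least numberOfTestColors colors that occur in either image have shares within threshold of each other. The class name, the int[] ARGB pixel format and the exact matching rule are assumptions for illustration.

```java
final class WebSafePalette {

    /** Maps a packed (A)RGB pixel to the index (0..215) of the nearest web-safe color. */
    static int webSafeIndex(int rgb) {
        int r = (rgb >> 16) & 0xFF, g = (rgb >> 8) & 0xFF, b = rgb & 0xFF;
        // Web-safe channel values are 0, 51, 102, 153, 204, 255; (c + 25) / 51 rounds to the nearest level.
        return ((r + 25) / 51) * 36 + ((g + 25) / 51) * 6 + (b + 25) / 51;
    }

    /** Fraction of the region's pixels that falls on each web-safe color. */
    static double[] paletteShares(int[] pixels) {
        int[] counts = new int[216];
        for (int p : pixels) {
            counts[webSafeIndex(p)]++;
        }
        double[] shares = new double[216];
        for (int i = 0; i < 216 && pixels.length > 0; i++) {
            shares[i] = (double) counts[i] / pixels.length;
        }
        return shares;
    }

    /** Counts colors that occur in at least one region and whose shares differ by at most threshold. */
    static int matchingColors(double[] a, double[] b, double threshold) {
        int matches = 0;
        for (int i = 0; i < a.length; i++) {
            if ((a[i] > 0 || b[i] > 0) && Math.abs(a[i] - b[i]) <= threshold) {
                matches++;
            }
        }
        return matches;
    }

    /** True if any pair of corresponding test regions matches on enough colors. */
    static boolean isSuspect(int[][] regionsA, int[][] regionsB, int numberOfTestColors, double threshold) {
        int regions = Math.min(regionsA.length, regionsB.length);
        for (int r = 0; r < regions; r++) {
            double[] a = paletteShares(regionsA[r]);
            double[] b = paletteShares(regionsB[r]);
            if (matchingColors(a, b, threshold) >= numberOfTestColors) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int[] original = {0xFF0000, 0xFE0102, 0x0000FF, 0x000000};
        int[] flipped = {0x000000, 0x0000FF, 0xFE0102, 0xFF0000}; // same pixels, reversed order
        System.out.println(isSuspect(new int[][] {original}, new int[][] {flipped}, 3, 0.05));
    }
}
```

Because the shares ignore pixel order, flipped copies of a region produce the same palette, which fits the flip detection described above. The pixels of a region could be read with BufferedImage.getRGB(startX, startY, w, h, null, 0, w).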