Matching a rotated bitmap to a collage image

Posted on 2024-08-30 23:13:02

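My problem is that I have an image of a detailed street map. On this map there can be small images of a particular sign (for example, a traffic light icon), rotated to an arbitrary angle and possibly resized. I have this small image as a bitmap. Is there an algorithm or technique I can use to locate the bitmap when a rotated and possibly resized copy of it exists in the large collage image?

This is similar to the problem of locating marker images in augmented reality, but mine is purely 2D, with no perspective distortion.

Edit: The small bitmap and the copy of it that I want to match in the collage image are roughly the same size; the maximum size difference is about 30%. The rotation is purely 2D, with no shearing or other distortion.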

此刻的回忆 2024-09-06 23:13:02

This is a very tricky question.

First: what resolution do you have for the rotation and resizing? If you have enough pixels to avoid aliasing effects, then you might be OK, but if one or the other representation of the sign is very small (i.e., it's small in the collage or small in the sample shot you have), rotating it to an arbitrary angle could degrade it badly.

Also, are you sure you don't have shearing or other kinds of effects? I'm assuming a purely 2D rotation, where the axis of rotation runs through the center of the camera (i.e., a stop sign will just be a rotated octagon, not a sheared octagon).

One thing you can try, if you have the patience and the sample data, is to adapt Viola and Jones's face detection algorithm to the sign. Basically, you need a bunch of training data in which you have masked the pixels that interest you apart from the background pixels that do not. The algorithm then randomly selects pixels from that training data ('examples') and, for each example, calculates a few hundred to a few thousand statistics ('features'). A feature can be anything from the current pixel intensity in the red channel to the summed intensity of a 5x5 neighborhood in the blue channel. Next, you build a histogram for each feature and look for features whose histograms separate foreground pixels from background pixels (i.e., foreground all on the left of the histogram, background all on the right). You then choose the best features for the job and run them to find the sign in the collage.
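The feature-selection step described above can be sketched roughly as follows. This is a simplified stand-in, not the actual Viola-Jones detector (which uses Haar-like features, integral images and a boosted cascade). The function names, the feature names and the choice of histogram overlap as the separation score are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def histogram(values: List[float], bins: int, lo: float, hi: float) -> List[float]:
    """Normalized histogram of values over the range [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins or 1.0  # avoid a zero width when all values are equal
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return [c / len(values) for c in counts]


def overlap(fg: List[float], bg: List[float], bins: int = 16) -> float:
    """Histogram intersection: 0.0 means fully separated, 1.0 means identical."""
    lo = min(min(fg), min(bg))
    hi = max(max(fg), max(bg))
    h_fg = histogram(fg, bins, lo, hi)
    h_bg = histogram(bg, bins, lo, hi)
    return sum(min(a, b) for a, b in zip(h_fg, h_bg))


def rank_features(
    fg_samples: Dict[str, List[float]],
    bg_samples: Dict[str, List[float]],
) -> List[Tuple[float, str]]:
    """Sort features from best (least overlap) to worst (most overlap).

    Both dicts map a hypothetical feature name to the values that feature
    takes on the sampled foreground or background pixels.
    """
    scores = [(overlap(fg_samples[name], bg_samples[name]), name) for name in fg_samples]
    return sorted(scores)
```

Calling `rank_features` with per-feature values measured on masked sign pixels and on background pixels returns the features ordered by how cleanly they split the two groups; the first few entries are the candidates to run over the collage.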

That is a brief summary of a friend's dissertation research. This kind of problem is hard to solve well, and it is easy to end up with a bad solution.

If you just have one sign and one collage and only want a one-off solution, you can basically cross-correlate the sign with the collage (this is a convolution with the sign flipped). Pad the smaller image with zeroes so that it's the same size as the larger, take the FFT of each one, then do a point-by-point multiplication of one spectrum with the complex conjugate of the other. Then, perform an inverse FFT on the result. You should see a spike at the location of the sign in the collage, depending on the severity of rotation and scaling (if you believe that they are very different, then you might need to try the correlation at a variety of different scalings and rotations).
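The FFT approach above can be sketched with NumPy as follows. This is a minimal sketch, assuming both images are 2D grayscale arrays and the sign is no larger than the collage. The function name `locate_by_fft` is hypothetical. Two choices here are mine rather than the answer's: the sketch multiplies by the complex conjugate of the sign's spectrum so that the product computes a cross-correlation rather than a convolution, and it subtracts the means so that uniformly bright regions do not dominate the result.

```python
import numpy as np


def locate_by_fft(collage: np.ndarray, sign: np.ndarray):
    """Return (row, col, score) of the strongest circular cross-correlation peak.

    (row, col) is the offset of the sign's top-left corner within the collage.
    """
    H, W = collage.shape
    h, w = sign.shape

    # Remove the mean of each image (an assumption, not part of the original recipe).
    c = collage - collage.mean()
    s = sign - sign.mean()

    # Zero-pad the sign to the collage size before transforming.
    padded = np.zeros((H, W))
    padded[:h, :w] = s

    # Point-by-point product with the conjugate spectrum, then inverse FFT.
    spectrum = np.fft.fft2(c) * np.conj(np.fft.fft2(padded))
    corr = np.real(np.fft.ifft2(spectrum))

    row, col = np.unravel_index(np.argmax(corr), corr.shape)
    return int(row), int(col), float(corr[row, col])
```

The correlation is circular, so a peak near the right or bottom edge may correspond to a match that wraps around. It is also not normalized by local image energy, so on real maps a high-contrast region can still produce a false peak.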

This second approach is easily done in MATLAB; otherwise, you'll need a library like FFTW to pull it off.
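When the sign may be rotated and rescaled, one way to "experiment with different scalings and rotations" is a brute-force search: resample the sign at each candidate angle and scale, correlate each version with the collage, and keep the best peak. The sketch below assumes NumPy, grayscale arrays, and nearest-neighbour resampling. The names `rotate_scale` and `search_pose` are hypothetical, and dividing by the template's energy is only a rough normalization, not full normalized cross-correlation.

```python
import numpy as np


def rotate_scale(img: np.ndarray, angle_deg: float, scale: float):
    """Nearest-neighbour resample of img, rotated and scaled about its center.

    Returns a square canvas plus a boolean mask of the pixels that came from img.
    """
    h, w = img.shape
    side = int(np.ceil(scale * np.hypot(h, w)))  # large enough for any angle
    ys, xs = np.mgrid[0:side, 0:side]
    yc = ys - (side - 1) / 2.0
    xc = xs - (side - 1) / 2.0
    t = np.deg2rad(angle_deg)
    # Inverse-map each output pixel back into the source image.
    src_x = (np.cos(t) * xc + np.sin(t) * yc) / scale + (w - 1) / 2.0
    src_y = (-np.sin(t) * xc + np.cos(t) * yc) / scale + (h - 1) / 2.0
    xi = np.rint(src_x).astype(int)
    yi = np.rint(src_y).astype(int)
    inside = (xi >= 0) & (xi < w) & (yi >= 0) & (yi < h)
    out = np.zeros((side, side))
    out[inside] = img[yi[inside], xi[inside]]
    return out, inside


def search_pose(collage: np.ndarray, sign: np.ndarray, angles, scales):
    """Try every (angle, scale) pair; return (score, row, col, angle, scale).

    (row, col) is the top-left corner of the template canvas in the collage.
    Assumes every canvas fits inside the collage.
    """
    collage_spec = np.fft.fft2(collage - collage.mean())
    best = None
    for scale in scales:
        for angle in angles:
            tmpl, mask = rotate_scale(sign, angle, scale)
            # Zero-mean inside the mask, zero outside it.
            tmpl = np.where(mask, tmpl - tmpl[mask].mean(), 0.0)
            norm = np.sqrt((tmpl ** 2).sum())
            if norm == 0.0:
                continue  # a flat template carries no information
            padded = np.zeros(collage.shape)
            padded[: tmpl.shape[0], : tmpl.shape[1]] = tmpl
            corr = np.real(np.fft.ifft2(collage_spec * np.conj(np.fft.fft2(padded))))
            corr /= norm
            row, col = np.unravel_index(np.argmax(corr), corr.shape)
            candidate = (float(corr[row, col]), int(row), int(col), angle, scale)
            if best is None or candidate[0] > best[0]:
                best = candidate
    return best
```

With a size difference of at most about 30%, a handful of scales between roughly 0.7 and 1.3 combined with angle steps of a few degrees would be a plausible starting grid; how fine the grid needs to be depends on how detailed the sign is.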
