Detecting the location of an image within a larger image
How do you detect the location of an image within a larger image? I have an unmodified copy of the image. This image is then changed to an arbitrary resolution and placed randomly within a much larger image which is of an arbitrary size. No other transformations are conducted on the resulting image. Python code would be ideal, and it would probably require libgd. If you know of a good approach to this problem you'll get a +1.
Answers (4)
There is a quick and dirty solution: slide a window over the target image, compute some measure of similarity at each location, and pick the location with the highest similarity. Then compare that similarity to a threshold. If the score is above the threshold, you conclude the image is there and that's the location; if the score is below the threshold, the image isn't there.
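A minimal sketch of that sliding-window search, assuming grayscale images stored as plain lists of lists and using the sum of squared differences as the score. The names `ssd`, `find_template` and `max_ssd` are made up for illustration. Because SSD measures dissimilarity, the threshold test is flipped: a window counts as a match when its score is at or below `max_ssd`.

```python
def ssd(image, template, top, left):
    """Sum of squared differences between the template and one image window."""
    total = 0
    for r, row in enumerate(template):
        for c, value in enumerate(row):
            diff = image[top + r][left + c] - value
            total += diff * diff
    return total


def find_template(image, template, max_ssd):
    """Return (top, left, score) of the best window, or None if nothing is close enough."""
    img_h, img_w = len(image), len(image[0])
    tpl_h, tpl_w = len(template), len(template[0])
    best = None
    # Visit every placement where the template fits entirely inside the image.
    for top in range(img_h - tpl_h + 1):
        for left in range(img_w - tpl_w + 1):
            score = ssd(image, template, top, left)
            if best is None or score < best[2]:
                best = (top, left, score)
    # Lower SSD means more similar, so accept only scores at or below the threshold.
    if best is not None and best[2] <= max_ssd:
        return best
    return None


# Tiny synthetic example: a 2x2 patch pasted at row 1, column 2 of a 4x5 image.
template = [[9, 8],
            [7, 6]]
image = [[0, 0, 0, 0, 0],
         [0, 0, 9, 8, 0],
         [0, 0, 7, 6, 0],
         [0, 0, 0, 0, 0]]
match = find_template(image, template, max_ssd=0)
blank = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
no_match = find_template(blank, template, max_ssd=10)
```

Pure Python loops like these are only practical for small images. On real data you would probably reach for an array library, but the logic stays the same.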
As a similarity measure, you can use normalized correlation or the sum of squared differences (SSD, i.e. the squared L2 norm of the pixel difference). As people mentioned, this will not deal with scale changes on its own. So you can also rescale your original image multiple times and repeat the process above with each scaled version. Depending on the size of your input image and the range of possible scales, this may be good enough, and it's easy to implement.
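The multi-scale idea could look roughly like the sketch below. It rests on several assumptions: images are grayscale lists of lists, the rescaling is a crude nearest-neighbour resize (`resize_nearest` is a hypothetical helper, not a library call), the scale is uniform in both directions, and the list of candidate `scales` happens to contain something close to the true one. The score is zero-mean normalized correlation, so 1.0 means a perfect match up to brightness and contrast.

```python
import math


def resize_nearest(img, new_h, new_w):
    """Nearest-neighbour resize of a list-of-lists grayscale image."""
    old_h, old_w = len(img), len(img[0])
    return [[img[r * old_h // new_h][c * old_w // new_w] for c in range(new_w)]
            for r in range(new_h)]


def ncc(image, template, top, left):
    """Zero-mean normalized correlation in [-1, 1] for the window at (top, left)."""
    h, w = len(template), len(template[0])
    win = [image[top + r][left + c] for r in range(h) for c in range(w)]
    tpl = [v for row in template for v in row]
    mean_w, mean_t = sum(win) / len(win), sum(tpl) / len(tpl)
    num = sum((a - mean_w) * (b - mean_t) for a, b in zip(win, tpl))
    den = math.sqrt(sum((a - mean_w) ** 2 for a in win) *
                    sum((b - mean_t) ** 2 for b in tpl))
    return num / den if den else 0.0  # flat windows carry no pattern


def search_scales(image, original, scales, threshold):
    """Return (score, top, left, scale) of the best match over all scales, or None."""
    best = None
    for s in scales:
        h = max(1, round(len(original) * s))
        w = max(1, round(len(original[0]) * s))
        if h > len(image) or w > len(image[0]):
            continue  # the scaled template no longer fits in the image
        tpl = resize_nearest(original, h, w)
        for top in range(len(image) - h + 1):
            for left in range(len(image[0]) - w + 1):
                score = ncc(image, tpl, top, left)
                if best is None or score > best[0]:
                    best = (score, top, left, s)
    return best if best and best[0] >= threshold else None


# Synthetic example: shrink a 4x4 original to 2x2 and paste it at row 3, column 4.
original = [[10, 10, 200, 200],
            [10, 10, 200, 200],
            [90, 90, 40, 40],
            [90, 90, 40, 40]]
image = [[0] * 7 for _ in range(6)]
small = resize_nearest(original, 2, 2)
for r in range(2):
    for c in range(2):
        image[3 + r][4 + c] = small[r][c]
result = search_scales(image, original, scales=[0.5, 1.0], threshold=0.9)
```

The cost grows with the number of scales tried, so a coarse scale grid followed by a finer search around the best candidate is one possible refinement.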
A proper solution is to use affine invariants. Try looking up "wide-baseline stereo matching"; people have looked at this problem in that context. The methods used are generally something like this:
Preprocessing of the original image
At the end of this stage, you will have a set of descriptors.
Testing (with the new test image).
You probably want cross-correlation. (Autocorrelation is correlating a signal with itself; cross correlating is correlating two different signals.)
What correlation does for you, over simply checking for exact matches, is that it will tell you where the best matches are, and how good they are. The flip side is that it is expensive for a 2-D picture: a direct implementation costs roughly O(N^2 M^2) for an N×N image and an M×M template (FFT-based versions do better), and it's not that simple an algorithm. But it's magic once you get it to work.
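To illustrate "where the best matches are, and how good they are", here is a small brute-force sketch that builds a normalized cross-correlation score for every placement and then ranks the peaks. It assumes grayscale lists of lists, and `correlation_map` and `top_matches` are illustrative names rather than library functions. The work done is the number of placements times the number of template pixels, which is why the direct approach gets slow quickly.

```python
import math


def correlation_map(image, template):
    """Normalized cross-correlation score for every placement of template in image."""
    h, w = len(template), len(template[0])
    tpl = [v for row in template for v in row]
    mean_t = sum(tpl) / len(tpl)
    tpl0 = [v - mean_t for v in tpl]
    norm_t = math.sqrt(sum(v * v for v in tpl0))
    scores = []
    for top in range(len(image) - h + 1):
        row_scores = []
        for left in range(len(image[0]) - w + 1):
            win = [image[top + r][left + c] for r in range(h) for c in range(w)]
            mean_w = sum(win) / len(win)
            win0 = [v - mean_w for v in win]
            norm_w = math.sqrt(sum(v * v for v in win0))
            dot = sum(a * b for a, b in zip(win0, tpl0))
            # A flat window (or flat template) has no pattern to correlate: score 0.
            row_scores.append(dot / (norm_w * norm_t) if norm_w and norm_t else 0.0)
        scores.append(row_scores)
    return scores


def top_matches(scores, k):
    """Return the k best (score, top, left) placements, best first."""
    ranked = [(s, top, left)
              for top, row in enumerate(scores)
              for left, s in enumerate(row)]
    return sorted(ranked, reverse=True)[:k]


# The image holds an exact copy of the template at (0, 0)
# and a slightly altered copy at (2, 3).
template = [[1, 5],
            [5, 1]]
image = [[1, 5, 0, 0, 0],
         [5, 1, 0, 0, 0],
         [0, 0, 0, 1, 5],
         [0, 0, 0, 4, 1]]
best_two = top_matches(correlation_map(image, template), k=2)
```

Here the exact copy scores 1.0 and the altered copy scores a little below that, so the ranking reports both the locations and how good each match is.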
EDIT: Aargh, you specified an arbitrary resize. That's going to break any correlation-based algorithm. Sorry, you're outside my experience now and SO won't let me delete this answer.
http://en.wikipedia.org/wiki/Autocorrelation is my first instinct.
Take a look at Scale-Invariant Feature Transforms; there are many different flavors that may be more or less tailored to the type of images you happen to be working with.