Detecting whether an object in one image is present in another image with OpenCV


I have a sample image which contains an object, such as the earrings in the following image:

https://i.sstatic.net/N5w9a.jpg

I then have a large candidate set of images for which I need to determine which one most likely contains the object, e.g.:

https://i.sstatic.net/xYL90.jpg

So I need to produce a score for each image, where the highest score corresponds to the image which most likely contains the target object. Now, in this case, I have the following conditions/constraints to work with/around:

1) I can obtain multiple sample images at different angles.

2) The sample images are likely to be at different resolutions, angles, and distances than the candidate images.

3) There are a LOT of candidate images (> 10,000), so it must be reasonably fast.

4) I'm willing to sacrifice some precision for speed, so if it means we have to search through the top 100 instead of just the top 10, that's fine and can be done manually.

5) I can manipulate the sample images manually, such as outlining the object that I wish to detect; the candidate images cannot be manipulated manually as there are too many.

6) I have no real background in OpenCV or computer vision at all, so I'm starting from scratch here.

My initial thought is to start by drawing a rough outline around the object in the sample image. Then, I could identify corners in the object and corners in the candidate image. I could profile the pixels around each corner to see if they look similar and then rank by the sum of the maximum similarity scores of every corner. I'm also not sure how to quantify similar pixels. I guess just the Euclidean distance of their RGB values?
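In rough Python, I imagine something like the following (the file names, patch size, and corner-detector settings are just placeholders):

import cv2
import numpy as np

def corner_patches(img, max_corners=50, r=4):
    # detect corners, then cut a (2r+1)x(2r+1) color patch around each one
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    corners = cv2.goodFeaturesToTrack(gray, max_corners, 0.01, 10)
    patches = []
    for x, y in corners.reshape(-1, 2).astype(int):
        p = img[y - r:y + r + 1, x - r:x + r + 1]
        if p.shape[:2] == (2 * r + 1, 2 * r + 1):  # skip corners on the border
            patches.append(p.astype(np.float32))
    return patches

def similarity_score(sample_patches, candidate_patches):
    # for each sample corner, take its best match among the candidate's
    # corners (smallest Euclidean RGB distance), and sum those
    return -sum(min(np.linalg.norm(sp - cp) for cp in candidate_patches)
                for sp in sample_patches)

sample = cv2.imread('sample.jpg')
candidate = cv2.imread('candidate.jpg')
print(similarity_score(corner_patches(sample), corner_patches(candidate)))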

The problem there is that it kind of ignores the center of the object. In the above examples, if the corners of the earrings are all near the gold frame, then it would not consider the red, green, and blue stones inside the earring. I suppose I could improve this by then looking at all pairs of corners and determining similarity by sampling some points along the line between them.

So I have a few questions:

A) Does this line of thinking make sense in general or is there something I'm missing?

B) Which specific algorithms from OpenCV should I investigate using? I'm aware that there are multiple corner detection algorithms, but I only need one and if the differences are all optimizing on the margins then I'm fine with the fastest.

C) Any example code using the algorithms that would be helpful to aid in my understanding?

My options for languages are either Python or C#.


4 Answers

∞琼窗梦回ˉ 2024-12-18 09:56:46


Fortunately, the kind guys from OpenCV just did that for you. Check in your samples folder "opencv\samples\cpp\matching_to_many_images.cpp". Compile and give it a try with the default images.

The algorithm can be easily adapted to make it faster or more precise.

Mainly, object recognition algorithms are split into two parts: keypoint detection & description, and object matching. For both of them there are many algorithms/variants, which you can play with directly in OpenCV.

Detection/description can be done by: SIFT/SURF/ORB/GFTT/STAR/FAST and others.

For matching you have: brute force, Hamming distance, etc. (Some matchers are specific to a given detection algorithm.)

HINTS to start:

  • Crop your original image so the object of interest covers as much of the image area as possible, and use that as your training image.

  • SIFT is the most accurate descriptor, and also the slowest. FAST is a good combination of speed and accuracy. GFTT is old and quite unreliable. ORB is newly added to OpenCV and is very promising in both speed and accuracy.

  • The results depend on the pose of the object in the other image. If it is resized, rotated, squeezed, partly covered, etc., try SIFT. If it is a simple task (i.e. it appears at almost the same size/rotation/etc.), most of the descriptors will cope well.
  • ORB may not be in the OpenCV release yet. Try downloading the latest from the OpenCV trunk and compiling it: https://code.ros.org/svn/opencv/trunk

So, you can find the best combination for you by trial and error.
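For illustration, here is a minimal sketch of swapping detectors and matchers. It is written against a newer OpenCV build, so the constructor names differ from the 2.4-era API discussed here, and the file names are placeholders:

import cv2

detector = cv2.ORB_create(nfeatures=1000)  # swap in cv2.SIFT_create() to compare
matcher = cv2.BFMatcher(cv2.NORM_HAMMING)  # use cv2.NORM_L2 for SIFT/SURF

img1 = cv2.imread('training.jpg', cv2.IMREAD_GRAYSCALE)   # cropped sample
img2 = cv2.imread('candidate.jpg', cv2.IMREAD_GRAYSCALE)

kp1, desc1 = detector.detectAndCompute(img1, None)
kp2, desc2 = detector.detectAndCompute(img2, None)

matches = matcher.knnMatch(desc1, desc2, k=2)
good = [m[0] for m in matches
        if len(m) == 2 and m[0].distance < 0.75 * m[1].distance]
print(len(good), 'ratio-test matches')  # usable as a crude per-candidate score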

For the details of every implementation, you should read the original papers/tutorials. Google Scholar is a good start.

a√萤火虫的光℡ 2024-12-18 09:56:46


Check out the SURF features, which are part of OpenCV. The idea here is that you have an algorithm for finding "interest points" in two images. You also have an algorithm for computing a descriptor of an image patch around each interest point. Typically this descriptor captures the distribution of edge orientations in the patch. You then try to find point correspondences, i.e. for each interest point in image A you try to find a corresponding interest point in image B. This is accomplished by comparing the descriptors and looking for the closest matches. Then, if you have a set of correspondences that are related by some geometric transformation, you have a detection.
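To make that last verification step concrete, here is a minimal sketch of scoring a candidate by RANSAC inliers (the function name and the 5.0-pixel threshold are my own illustrations):

import cv2

def inlier_score(pts_a, pts_b):
    # pts_a, pts_b: matched keypoint coordinates as Nx2 float32 arrays
    if len(pts_a) < 4:  # a homography needs at least 4 correspondences
        return 0
    H, mask = cv2.findHomography(pts_a, pts_b, cv2.RANSAC, 5.0)
    return 0 if mask is None else int(mask.sum())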

Of course, this is a very high-level explanation. The devil is in the details, and for those you should read some papers. Start with "Distinctive Image Features from Scale-Invariant Keypoints" by David Lowe, and then read the papers on SURF.

Also, consider moving this question to Signal and Image Processing Stack Exchange.

杯别 2024-12-18 09:56:46


In case someone comes along in the future, here's a small sample doing this with OpenCV. It's based on the OpenCV sample, but in my opinion this is a bit clearer, so I'm including it as well.

Tested with OpenCV 2.4.4.

#!/usr/bin/env python

'''
Uses SURF to match two images.
  Finds common features between two images and draws them

Based on the sample code from opencv:
  samples/python2/find_obj.py

USAGE
  find_obj.py <image1> <image2>
'''

from __future__ import print_function  # keeps the prints Python 2/3 compatible

import sys

import numpy
import cv2


###############################################################################
# Image Matching
###############################################################################

def match_images(img1, img2, img1_features=None, img2_features=None):
    """Given two images, returns the matches"""
    detector = cv2.SURF(3200)
    matcher = cv2.BFMatcher(cv2.NORM_L2)

    if img1_features is None:
        kp1, desc1 = detector.detectAndCompute(img1, None)
    else:
        kp1, desc1 = img1_features

    if img2_features is None:
        kp2, desc2 = detector.detectAndCompute(img2, None)
    else:
        kp2, desc2 = img2_features

    # print('img1 - %d features, img2 - %d features' % (len(kp1), len(kp2)))

    raw_matches = matcher.knnMatch(desc1, trainDescriptors=desc2, k=2)
    kp_pairs = filter_matches(kp1, kp2, raw_matches)
    return kp_pairs


def filter_matches(kp1, kp2, matches, ratio=0.75):
    """Filters features that are common to both images"""
    mkp1, mkp2 = [], []
    for m in matches:
        if len(m) == 2 and m[0].distance < m[1].distance * ratio:
            m = m[0]
            mkp1.append(kp1[m.queryIdx])
            mkp2.append(kp2[m.trainIdx])
    kp_pairs = list(zip(mkp1, mkp2))  # list() so len() also works on Python 3
    return kp_pairs


###############################################################################
# Match Displaying
###############################################################################

def draw_matches(window_name, kp_pairs, img1, img2):
    """Draws the matches"""
    mkp1, mkp2 = zip(*kp_pairs)

    H = None
    status = None

    if len(kp_pairs) >= 4:
        p1 = numpy.float32([kp.pt for kp in mkp1])
        p2 = numpy.float32([kp.pt for kp in mkp2])
        H, status = cv2.findHomography(p1, p2, cv2.RANSAC, 5.0)

    if len(kp_pairs):
        explore_match(window_name, img1, img2, kp_pairs, status, H)


def explore_match(win, img1, img2, kp_pairs, status=None, H=None):
    """Draws lines between the matched features"""
    h1, w1 = img1.shape[:2]
    h2, w2 = img2.shape[:2]
    vis = numpy.zeros((max(h1, h2), w1 + w2), numpy.uint8)
    vis[:h1, :w1] = img1
    vis[:h2, w1:w1 + w2] = img2
    vis = cv2.cvtColor(vis, cv2.COLOR_GRAY2BGR)

    if H is not None:
        corners = numpy.float32([[0, 0], [w1, 0], [w1, h1], [0, h1]])
        reshaped = cv2.perspectiveTransform(corners.reshape(1, -1, 2), H)
        reshaped = reshaped.reshape(-1, 2)
        corners = numpy.int32(reshaped + (w1, 0))
        cv2.polylines(vis, [corners], True, (255, 255, 255))

    if status is None:
        status = numpy.ones(len(kp_pairs), numpy.bool_)
    p1 = numpy.int32([kpp[0].pt for kpp in kp_pairs])
    p2 = numpy.int32([kpp[1].pt for kpp in kp_pairs]) + (w1, 0)

    green = (0, 255, 0)
    red = (0, 0, 255)
    for (x1, y1), (x2, y2), inlier in zip(p1, p2, status):
        if inlier:
            col = green
            cv2.circle(vis, (x1, y1), 2, col, -1)
            cv2.circle(vis, (x2, y2), 2, col, -1)
        else:
            col = red
            r = 2
            thickness = 3
            cv2.line(vis, (x1 - r, y1 - r), (x1 + r, y1 + r), col, thickness)
            cv2.line(vis, (x1 - r, y1 + r), (x1 + r, y1 - r), col, thickness)
            cv2.line(vis, (x2 - r, y2 - r), (x2 + r, y2 + r), col, thickness)
            cv2.line(vis, (x2 - r, y2 + r), (x2 + r, y2 - r), col, thickness)
    for (x1, y1), (x2, y2), inlier in zip(p1, p2, status):
        if inlier:
            cv2.line(vis, (x1, y1), (x2, y2), green)

    cv2.imshow(win, vis)

###############################################################################
# Test Main
###############################################################################

if __name__ == '__main__':
    if len(sys.argv) < 3:
        print("No filenames specified")
        print("USAGE: find_obj.py <image1> <image2>")
        sys.exit(1)

    fn1 = sys.argv[1]
    fn2 = sys.argv[2]

    img1 = cv2.imread(fn1, 0)
    img2 = cv2.imread(fn2, 0)

    if img1 is None:
        print('Failed to load fn1:', fn1)
        sys.exit(1)

    if img2 is None:
        print('Failed to load fn2:', fn2)
        sys.exit(1)

    kp_pairs = match_images(img1, img2)

    if kp_pairs:
        draw_matches('find_obj', kp_pairs, img1, img2)
    else:
        print("No matches found")

    cv2.waitKey()
    cv2.destroyAllWindows()
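A note for anyone trying this on a newer OpenCV: cv2.SURF no longer exists under that name. The sketch below shows the usual replacements, assuming a modern build (the SURF variant needs an opencv-contrib build with the nonfree modules enabled):

import cv2

# SURF moved to the contrib package in OpenCV 3.x (assumption: your build
# ships the nonfree modules):
detector = cv2.xfeatures2d.SURF_create(hessianThreshold=3200)

# ORB ships with every build and is a patent-free alternative; its binary
# descriptors pair with a Hamming matcher rather than NORM_L2:
detector = cv2.ORB_create()
matcher = cv2.BFMatcher(cv2.NORM_HAMMING)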
橘虞初梦 2024-12-18 09:56:46


As said, algorithms like SIFT and SURF consist of a feature point detector, which is invariant to a number of distortions, and a descriptor, which aims to robustly model the feature point and its surroundings.

The latter is increasingly used for image categorization and identification in what is commonly known as the "bag of words" or "visual words" approach.

In its simplest form, one can collect all the descriptors from all images and cluster them, for example using k-means. Every original image then has descriptors that contribute to a number of clusters. The centroids of those clusters, i.e. the visual words, can be used as a new descriptor for the image. These can then be used in an architecture with an inverted-file design.
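A hedged sketch of that clustering step with OpenCV's k-means; the names are illustrative, and all_desc is assumed to stack the float32 descriptors of every image:

import numpy as np
import cv2

def build_vocabulary(all_desc, k=100):
    # cluster all descriptors; the k cluster centers are the "visual words"
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 20, 1.0)
    _, _, centers = cv2.kmeans(all_desc, k, None, criteria, 5,
                               cv2.KMEANS_RANDOM_CENTERS)
    return centers

def bow_histogram(desc, vocab):
    # assign each descriptor to its nearest visual word, then build a
    # normalized histogram: the image's new, fixed-length descriptor
    dists = np.linalg.norm(desc[:, None, :] - vocab[None, :, :], axis=2)
    hist = np.bincount(dists.argmin(axis=1), minlength=len(vocab))
    return hist.astype(np.float32) / max(hist.sum(), 1)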

This approach allows for soft matching and for a certain amount of generalization, e.g. retrieving all images with airplanes.

  • The VLFeat website contains, next to an excellent SIFT library, a nice demo of this approach classifying the Caltech 101 dataset.

  • Caltech itself offers Matlab/C++ software together with relevant publications.

  • The work by LEAR is also a good start.
