Using keypoint feature matching + homography to straighten a document (Aadhaar)

Posted on 2025-01-14 21:30:47

Hi, I'm trying to build an OCR pipeline in which the model reads an uploaded document. However, a lot of the time the uploaded documents are skewed or tilted. I plan to straighten and/or resize each document based on a template.

To achieve this, I intend to use feature matching and homography. However, whenever I compute keypoints and descriptors (using ORB) and try to match them with brute-force matching, none of the features seem to match. Here's the code I've used so far, along with its results. Can someone point me in the right direction if I'm missing something or doing something incorrectly?

def straighten_image(ORIG_IMG, IMG2):
    # read both the images:
    orig_image = cv2.imread(ORIG_IMG)
    img_input = cv2.imread(IMG2)
    
    orig_gray_scale = cv2.cvtColor(orig_image, cv2.COLOR_BGR2GRAY)
    gray_scale_img = cv2.cvtColor(img_input, cv2.COLOR_BGR2GRAY)
    
    #Detect ORB features and compute descriptors
    MAX_NUM_FEATURES = 100
    orb = cv2.ORB_create(MAX_NUM_FEATURES)
    keypoints1, descriptors1 = orb.detectAndCompute(orig_gray_scale, None)
    keypoints2, descriptors2= orb.detectAndCompute(gray_scale_img, None)
    
    #display image with keypoints
    orig_wid_decriptors = cv2.drawKeypoints(orig_gray_scale, keypoints1, outImage = np.array([]), color= (255, 0, 0), flags= cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
    inp_wid_decriptors = cv2.drawKeypoints(img_input, keypoints2, outImage = np.array([]), color= (255, 0, 0), flags= cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)

    #Match features
    
    matcher = cv2.DescriptorMatcher_create(cv2.DESCRIPTOR_MATCHER_BRUTEFORCE_HAMMING)
    matches = matcher.match(descriptors1, descriptors2, None)
    
    print(type(matches))
    
    #sort matches by distance, best first (sorted() accepts both lists and tuples)
    matches = sorted(matches, key=lambda x: x.distance)
    
    
    #Remove not-so-good matches
    numGoodMatches = int(len(matches)*0.1)
    matches = matches[:numGoodMatches]
    
    #Draw Top matches
    im_matches = cv2.drawMatches(orig_gray_scale, keypoints1, gray_scale_img, keypoints2, matches, None)
    
    cv2.imshow("", im_matches)
    cv2.waitKey(0)
    
    #Homography
    points1 = np.zeros((len(matches), 2), dtype = np.float32)
    points2 = np.zeros((len(matches), 2), dtype = np.float32)
    
    for i, match in enumerate(matches):
        points1[i, :] = keypoints1[match.queryIdx].pt
        points2[i, :] = keypoints2[match.trainIdx].pt
        
    #Find homography:
    h, mask = cv2.findHomography(points2, points1, cv2.RANSAC)
    
    #Warp image
    # Use homography to warp image
    height, width = orig_gray_scale.shape
    inp_reg = cv2.warpPerspective(gray_scale_img, h, (width, height), borderValue = 255)
    
    return inp_reg


import cv2
import matplotlib.pyplot as plt
import numpy as np
template = "template_aadhaar.jpg"
test = "test.jpeg"

str_img = straighten_image(template, test)

cv2.imshow("", str_img)
cv2.waitKey(0)

EDIT: If I use my own ID card (perfectly straight) as the template and try to align a tilted image of that same card, the features match and the tilted image is re-aligned perfectly. However, I need the model to re-align any other ID card based on the template. By "any ID card" I mean that the details could differ, but the layout and font would be exactly the same.

EDIT #2: As suggested by @Olli, I tried using a template that contains only the features that are the same across all Aadhaar cards. Image attached. But the feature matching is still somewhat arbitrary.
