Opencv 基于特征点的图像对齐

发布于 2024-06-10 23:48:16 字数 7170 浏览 31 评论 0

二维图像之间的单应变换

图像中的 2D 点 (x,y)(x,y) 可以被表示成 3D 向量的形式 $(x_1,x_2,x_3)$，其中 $x=\frac{x_1}{x_3}$，$y=\frac{x_2}{x_3}$。它被叫做点的齐次表达，位于投影平面 $P^2$ 上。所谓单应就是发生在投影平面 $P^2$ 上的点和线可逆的映射。其它叫法包括射影变换、投影变换和平面投影变换等。

单应变换矩阵是一个 3*3 的矩阵 H。这个变换可以被任意乘上一个非零常数，而不改变变换本身。它虽然具有 9 个元素，但是具有 8 个自由度。这意味这它里面有 8 个未知参数待求。

典型地，可以通过图像之间的特征匹配来估计单应矩阵。

单应和齐次坐标

一个单应矩阵是大小为 3*3 的矩阵$H = \begin{bmatrix} h_{11} & h_{12} & h_{13}\ h_{21} & h_{22} & h_{23}\ h_{31} & h_{32} & h_{33}\end{bmatrix}$，满足给定一个点 $P_1 = \left[ \begin{matrix} x_1 \ y_1 \ w_1 \end{matrix} \right]$ ,矩阵 H 把点$P_1$变换成一个新的点$P_2 = \left[ \begin{matrix} x_2 \ y_2 \ w_2 \end{matrix} \right] = \begin{bmatrix} h_{11} & h_{12} & h_{13}\ h_{21} & h_{22} & h_{23}\ h_{31} & h_{32} & h_{33}\end{bmatrix}\cdot \left[ \begin{matrix} x_1 \ y_1 \ w_1 \end{matrix} \right]$ 。由于他们都是齐次坐标，对应在图像上的两个点分别是$\left[ \begin{matrix} \frac{x_1}{w_1} \ \frac{y_1}{w_1}\end{matrix} \right]$，$\left[ \begin{matrix} \frac{x_2}{w_2} \ \frac{y_2}{w_2}\end{matrix} \right]$ 。

单应的自由度

如果给定一个单位$H={h_{ij}}$，给 H 的每个元素乘上 a，得到的单应$aH$和$H$作用相同，因为新的单应无非把齐次点$P_1$变成了齐次点$aP_2$，$aP_2$和$P_2$在图像上对应的点是相同的。所以一个单应中只有 8 个自由度，一般令$h_{33}=1$来归一化。

求解单应

8 个未知数需要 8 个方程来求解，之所以 4 对点能求解，因为他们一个点能提供两个方程。

假设图像上有两个点 $(x_1,y_1)$，$(x_2,y_2)$，他们的齐次坐标为 $\left[ \begin{matrix} x_1 \ y_1 \ 1 \end{matrix} \right]$，$\left[ \begin{matrix} x_2 \ y_2 \ 1 \end{matrix} \right]$。

带入上述的推导可以得到：

$$
x_2=x_1h_{11}+y_1h_{12}+h_{13}
$$

$$
y_2=x_1h_{21}+y_1h_{22}+h_{23}
$$

$$
1=x_1h_{31}+y_1h_{32}+h_{33}
$$

一般令$h_{33}=1$来归一化，可得到：

$$
x_2=\frac{x_1h_{11}+y_1h_{12}+h_{13}}{x_1h_{31}+y_1h_{32}+1}
$$

$$
y_2=\frac{x_1h_{21}+y_1h_{22}+h_{23}}{x_1h_{31}+y_1h_{32}+1}
$$

把这两个式子重新组织一下，得到等价的矩阵形式：

$$
Au=v
$$

$$
A = \left[ \begin{matrix} x_1 \ y_1 \ 1 \ 0 \ 0 \ 0 \ -x_1x_2 \ -x_2y_1 \ 0 \ 0 \ 0 \ x_1 \ y_1 \ 1 \ -x_1y_2 \ -y_1y_2 \end{matrix} \right]
$$

$$
u=\left[ \begin{matrix} h_{11} \ h_{12} \ h_{13} \ h_{21} \ h_{22} \ h_{23} \ h_{31} \ h_{32} \ h_{33} \end{matrix} \right]^T
$$

$$
v=\left[ \begin{matrix} x_2 \ y_2 \end{matrix} \right]^T
$$

如果有四对不共线匹配点对，这个方程组就能够垒到 8 行，存在唯一解。

如果多于四对点，比如有 n 对点，方程就垒到 2n 行，用最小二乘法或 SVD 分解就可以求解$H$。

由于点对中可能存在不少错误匹配，一般使用 RANSAC 算法剔除错误匹配点对。

Opencv 中单应性矩阵 H 的计算

如果在两幅对应的图像中已知 4 个映射点的坐标，就可以使用 findHomography 函数如下：

C++：Mat cv::findHomography(InputArray srcPoints,InputArray dstPoints,OutputArray mask,int method = 0,double ransacReprojThreshold = 3)
Python：retval,mask=cv.findHomography(srcPoints,dstPoints[,method[,ransacReprojThreshold[,mask[,maxIters[,confidence]]]]])

自动寻找对应点 (corresponding points)

由上文知道知道两对对应点(4 个映射点) 的坐标即可求得单应性矩阵 H。

我们可以使用在 OpenCV 中的几个关键点检测器（例如 SIFT，SURF 和 ORB）。

本文将采用 ORB 关键点检测器，SIFT 和 SURF 已经注册专利。

一个特征点检测器由两个部分组成：

定位器 (Locator)：它可以识别图像上在平移（移位），缩放（缩小增大/缩小）和旋转等图像变换下稳定的点。定位器查找这些点的 x，y 坐标。 ORB 检测器使用的定位器叫做 FAST 。

描述符 (Descriptor)：上述步骤中的定位器仅告诉我们特征点在哪里。特征检测器的第二部分是对点的外观进行编码的描述符，以便我们可以从另幅图中指出同一个特征点。在特征点处评定的描述符只是一个数字数组。理想情况下，两幅图像中的相同物理点应具有相同的描述符。 ORB 使用的是 BRISK 特征描述符的修改版本。

总结：图像对齐思路

读取图像
特征点检测
匹配特征点
计算单应矩阵参数
矫正图像

C++ Code

#include <opencv2/opencv.hpp>
#include "opencv2/xfeatures2d.hpp"
#include "opencv2/features2d.hpp"

using namespace std;
using namespace cv;
using namespace cv::xfeatures2d;

const int MAX_FEATURES = 500;
const float GOOD_MATCH_PERCENT = 0.15f;


void alignImages(Mat &im1, Mat &im2, Mat &im1Reg, Mat &h)

{

  // Convert images to grayscale
  Mat im1Gray, im2Gray;
  cvtColor(im1, im1Gray, CV_BGR2GRAY);
  cvtColor(im2, im2Gray, CV_BGR2GRAY);

  // Variables to store keypoints and descriptors
  std::vector<KeyPoint> keypoints1, keypoints2;
  Mat descriptors1, descriptors2;

  // Detect ORB features and compute descriptors.
  Ptr<Feature2D> orb = ORB::create(MAX_FEATURES);
  orb->detectAndCompute(im1Gray, Mat(), keypoints1, descriptors1);
  orb->detectAndCompute(im2Gray, Mat(), keypoints2, descriptors2);

  // Match features.
  std::vector<DMatch> matches;
  Ptr<DescriptorMatcher> matcher = DescriptorMatcher::create("BruteForce-Hamming");
  matcher->match(descriptors1, descriptors2, matches, Mat());

  // Sort matches by score
  std::sort(matches.begin(), matches.end());

  // Remove not so good matches
  const int numGoodMatches = matches.size() * GOOD_MATCH_PERCENT;
  matches.erase(matches.begin()+numGoodMatches, matches.end());


  // Draw top matches
  Mat imMatches;
  drawMatches(im1, keypoints1, im2, keypoints2, matches, imMatches);
  imwrite("matches.jpg", imMatches);


  // Extract location of good matches
  std::vector<Point2f> points1, points2;

  for( size_t i = 0; i < matches.size(); i++ )
  {
    points1.push_back( keypoints1[ matches[i].queryIdx ].pt );
    points2.push_back( keypoints2[ matches[i].trainIdx ].pt );
  }

  // Find homography
  h = findHomography( points1, points2, RANSAC );

  // Use homography to warp image
  warpPerspective(im1, im1Reg, h, im2.size());

}


int main(int argc, char **argv)
{
  // Read reference image
  string refFilename("form.jpg"); 
  cout << "Reading reference image : " << refFilename << endl; 
  Mat imReference = imread(refFilename);


  // Read image to be aligned
  string imFilename("scanned-form.jpg");
  cout << "Reading image to align : " << imFilename << endl; 
  Mat im = imread(imFilename);


  // Registered image will be resotred in imReg. 
  // The estimated homography will be stored in h. 
  Mat imReg, h;

  // Align images
  cout << "Aligning images ..." << endl; 
  alignImages(im, imReference, imReg, h);

  // Write aligned image to disk. 
  string outFilename("aligned.jpg");
  cout << "Saving aligned image : " << outFilename << endl; 
  imwrite(outFilename, imReg);

  // Print estimated homography
  cout << "Estimated homography : \n" << h << endl; 
}