从 2 张图像进行 3D 重建,无需有关相机的信息

发布于 2024-12-29 03:38:52 字数 10077 浏览 1 评论 0原文

我是这个领域的新手,我正在尝试在 2D 图像中的 3D 中建模一个简单的场景,但我没有任何有关相机的信息。我知道有3个选项

  • 我有两张图像,并且我知道我从 XML 加载的相机型号(内部),例如 loadXMLFromFile() => stereoRectify() =>; reprojectImageTo3D()

  • 我没有它们,但我可以校准我的相机=> stereoCalibrate() =>; stereoRectify() =>; reprojectImageTo3D()

  • 我无法校准相机(这是我的情况,因为我没有拍摄两张图像的相机,那么我需要在两张图片上找到成对的关键点例如,使用 SURF、SIFT 处理图像(我实际上可以使用任何斑点检测器),然后计算这些关键点的描述符,然后根据它们的描述符匹配图像右侧和图像左侧的关键点,然后从中找到基本矩阵进行处理。更难,就像这样:

    1. 检测关键点(SURF、SIFT)=>
    2. 提取描述符(SURF、SIFT)=>
    3. 比较和匹配描述符(BruteForce、基于 Flann 的方法)=>
    4. 从这些对中找到基本垫(findFundamentalMat())=>
    5. stereoRectifyUncaliblated() =>;
    6. reprojectImageTo3D()

我正在使用最后一种方法和我的问题是:

1)是吗?

2)如果可以的话,我对最后一步stereoRectifyUncaliberated() =>有疑问reprojectImageTo3D()reprojectImageTo3D() 函数的签名为:

void reprojectImageTo3D(InputArray disparity, OutputArray _3dImage, InputArray Q, bool handleMissingValues=false, int depth=-1 )

cv::reprojectImageTo3D(imgDisparity8U, xyz, Q, true) (in my code)

参数:

  • disparity – 输入单通道 8 位无符号、16 位有符号、32 位有符号或 32 位浮点点视差图像。
  • _3dImage – 输出与视差大小相同的 3 通道浮点图像。 _3dImage(x,y) 的每个元素都包含根据视差图计算出的点 (x,y) 的 3D 坐标。
  • Q – 4x4 透视变换矩阵,可以通过 stereoRectify() 获得。
  • handleMissingValues – 指示函数是否应处理缺失值(即未计算差异的点)。如果handleMissingValues=true,则与异常值相对应的具有最小视差的像素(请参阅StereoBM::operator())将转换为具有非常大的 Z 值的 3D 点(当前设置为 10000)。
  • ddepth – 可选的输出数组深度。如果为 -1,输出图像将具有 CV_32F 深度。 ddepth 还可以设置为 CV_16SCV_32S 或 `CV_32F'。

如何获得Q矩阵?是否可以用FH1H2或其他方式获得Q矩阵?

3)是否有另一种方法可以在不校准相机的情况下获得xyz坐标?

我的代码是:

#include <opencv2/core/core.hpp>
#include <opencv2/calib3d/calib3d.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/contrib/contrib.hpp>
#include <opencv2/features2d/features2d.hpp>
#include <stdio.h>
#include <iostream>
#include <vector>
#include <conio.h>
#include <opencv/cv.h>
#include <opencv/cxcore.h>
#include <opencv/cvaux.h>


using namespace cv;
using namespace std;

int main(int argc, char *argv[]){

    // Read the images
    Mat imgLeft = imread( argv[1], CV_LOAD_IMAGE_GRAYSCALE );
    Mat imgRight = imread( argv[2], CV_LOAD_IMAGE_GRAYSCALE );

    // check
    if (!imgLeft.data || !imgRight.data)
            return 0;

    // 1] find pair keypoints on both images (SURF, SIFT):::::::::::::::::::::::::::::

    // vector of keypoints
    std::vector<cv::KeyPoint> keypointsLeft;
    std::vector<cv::KeyPoint> keypointsRight;

    // Construct the SURF feature detector object
    cv::SiftFeatureDetector sift(
            0.01, // feature threshold
            10); // threshold to reduce
                // sensitivity to lines
                // Detect the SURF features

    // Detection of the SIFT features
    sift.detect(imgLeft,keypointsLeft);
    sift.detect(imgRight,keypointsRight);

    std::cout << "Number of SURF points (1): " << keypointsLeft.size() << std::endl;
    std::cout << "Number of SURF points (2): " << keypointsRight.size() << std::endl;

    // 2] compute descriptors of these keypoints (SURF,SIFT) ::::::::::::::::::::::::::

    // Construction of the SURF descriptor extractor
    cv::SurfDescriptorExtractor surfDesc;

    // Extraction of the SURF descriptors
    cv::Mat descriptorsLeft, descriptorsRight;
    surfDesc.compute(imgLeft,keypointsLeft,descriptorsLeft);
    surfDesc.compute(imgRight,keypointsRight,descriptorsRight);

    std::cout << "descriptor matrix size: " << descriptorsLeft.rows << " by " << descriptorsLeft.cols << std::endl;

    // 3] matching keypoints from image right and image left according to their descriptors (BruteForce, Flann based approaches)

    // Construction of the matcher
    cv::BruteForceMatcher<cv::L2<float> > matcher;

    // Match the two image descriptors
    std::vector<cv::DMatch> matches;
    matcher.match(descriptorsLeft,descriptorsRight, matches);

    std::cout << "Number of matched points: " << matches.size() << std::endl;


    // 4] find the fundamental mat ::::::::::::::::::::::::::::::::::::::::::::::::::::

    // Convert 1 vector of keypoints into
    // 2 vectors of Point2f for compute F matrix
    // with cv::findFundamentalMat() function
    std::vector<int> pointIndexesLeft;
    std::vector<int> pointIndexesRight;
    for (std::vector<cv::DMatch>::const_iterator it= matches.begin(); it!= matches.end(); ++it) {

         // Get the indexes of the selected matched keypoints
         pointIndexesLeft.push_back(it->queryIdx);
         pointIndexesRight.push_back(it->trainIdx);
    }

    // Convert keypoints into Point2f
    std::vector<cv::Point2f> selPointsLeft, selPointsRight;
    cv::KeyPoint::convert(keypointsLeft,selPointsLeft,pointIndexesLeft);
    cv::KeyPoint::convert(keypointsRight,selPointsRight,pointIndexesRight);

    /* check by drawing the points
    std::vector<cv::Point2f>::const_iterator it= selPointsLeft.begin();
    while (it!=selPointsLeft.end()) {

            // draw a circle at each corner location
            cv::circle(imgLeft,*it,3,cv::Scalar(255,255,255),2);
            ++it;
    }

    it= selPointsRight.begin();
    while (it!=selPointsRight.end()) {

            // draw a circle at each corner location
            cv::circle(imgRight,*it,3,cv::Scalar(255,255,255),2);
            ++it;
    } */

    // Compute F matrix from n>=8 matches
    cv::Mat fundemental= cv::findFundamentalMat(
            cv::Mat(selPointsLeft), // points in first image
            cv::Mat(selPointsRight), // points in second image
            CV_FM_RANSAC);       // 8-point method

    std::cout << "F-Matrix size= " << fundemental.rows << "," << fundemental.cols << std::endl;

    /* draw the left points corresponding epipolar lines in right image
    std::vector<cv::Vec3f> linesLeft;
    cv::computeCorrespondEpilines(
            cv::Mat(selPointsLeft), // image points
            1,                      // in image 1 (can also be 2)
            fundemental,            // F matrix
            linesLeft);             // vector of epipolar lines

    // for all epipolar lines
    for (vector<cv::Vec3f>::const_iterator it= linesLeft.begin(); it!=linesLeft.end(); ++it) {

        // draw the epipolar line between first and last column
        cv::line(imgRight,cv::Point(0,-(*it)[2]/(*it)[1]),cv::Point(imgRight.cols,-((*it)[2]+(*it)[0]*imgRight.cols)/(*it)[1]),cv::Scalar(255,255,255));
    }

    // draw the left points corresponding epipolar lines in left image
    std::vector<cv::Vec3f> linesRight;
    cv::computeCorrespondEpilines(cv::Mat(selPointsRight),2,fundemental,linesRight);
    for (vector<cv::Vec3f>::const_iterator it= linesRight.begin(); it!=linesRight.end(); ++it) {

        // draw the epipolar line between first and last column
        cv::line(imgLeft,cv::Point(0,-(*it)[2]/(*it)[1]), cv::Point(imgLeft.cols,-((*it)[2]+(*it)[0]*imgLeft.cols)/(*it)[1]), cv::Scalar(255,255,255));
    }

    // Display the images with points and epipolar lines
    cv::namedWindow("Right Image Epilines");
    cv::imshow("Right Image Epilines",imgRight);
    cv::namedWindow("Left Image Epilines");
    cv::imshow("Left Image Epilines",imgLeft);
    */

    // 5] stereoRectifyUncalibrated()::::::::::::::::::::::::::::::::::::::::::::::::::

    //H1, H2 – The output rectification homography matrices for the first and for the second images.
    cv::Mat H1(4,4, imgRight.type());
    cv::Mat H2(4,4, imgRight.type());
    cv::stereoRectifyUncalibrated(selPointsRight, selPointsLeft, fundemental, imgRight.size(), H1, H2);


    // create the image in which we will save our disparities
    Mat imgDisparity16S = Mat( imgLeft.rows, imgLeft.cols, CV_16S );
    Mat imgDisparity8U = Mat( imgLeft.rows, imgLeft.cols, CV_8UC1 );

    // Call the constructor for StereoBM
    int ndisparities = 16*5;      // < Range of disparity >
    int SADWindowSize = 5;        // < Size of the block window > Must be odd. Is the 
                                  // size of averaging window used to match pixel  
                                  // blocks(larger values mean better robustness to
                                  // noise, but yield blurry disparity maps)

    StereoBM sbm( StereoBM::BASIC_PRESET,
        ndisparities,
        SADWindowSize );

    // Calculate the disparity image
    sbm( imgLeft, imgRight, imgDisparity16S, CV_16S );

    // Check its extreme values
    double minVal; double maxVal;

    minMaxLoc( imgDisparity16S, &minVal, &maxVal );

    printf("Min disp: %f Max value: %f \n", minVal, maxVal);

    // Display it as a CV_8UC1 image
    imgDisparity16S.convertTo( imgDisparity8U, CV_8UC1, 255/(maxVal - minVal));

    namedWindow( "windowDisparity", CV_WINDOW_NORMAL );
    imshow( "windowDisparity", imgDisparity8U );


    // 6] reprojectImageTo3D() :::::::::::::::::::::::::::::::::::::::::::::::::::::

    //Mat xyz;
    //cv::reprojectImageTo3D(imgDisparity8U, xyz, Q, true);

    //How can I get the Q matrix? Is possibile to obtain the Q matrix with 
    //F, H1 and H2 or in another way?
    //Is there another way for obtain the xyz coordinates?

    cv::waitKey();
    return 0;
}

I'm new in this field and I'm trying to model a simple scene in 3d out of 2d images and I dont have any info about cameras. I know that there are 3 options:

  • I have two images and I know the model of my camera (intrisics) that I loaded from a XML for instance loadXMLFromFile() => stereoRectify() => reprojectImageTo3D()

  • I don't have them but I can calibrate my camera => stereoCalibrate() => stereoRectify() => reprojectImageTo3D()

  • I can't calibrate the camera (it is my case, because I don't have the camera that has taken the 2 images, then I need to find pair keypoints on both images with SURF, SIFT for instance (I can use any blob detector actually), then compute descriptors of these keypoints, then match keypoints from image right and image left according to their descriptors, and then find the fundamental matrix from them. The processing is much harder and would be like this:

    1. detect keypoints (SURF, SIFT) =>
    2. extract descriptors (SURF,SIFT) =>
    3. compare and match descriptors (BruteForce, Flann based approaches) =>
    4. find fundamental mat (findFundamentalMat()) from these pairs =>
    5. stereoRectifyUncalibrated() =>
    6. reprojectImageTo3D()

I'm using the last approach and my questions are:

1) Is it right?

2) if it's ok, I have a doubt about the last step stereoRectifyUncalibrated() => reprojectImageTo3D(). The signature of reprojectImageTo3D() function is:

void reprojectImageTo3D(InputArray disparity, OutputArray _3dImage, InputArray Q, bool handleMissingValues=false, int depth=-1 )

cv::reprojectImageTo3D(imgDisparity8U, xyz, Q, true) (in my code)

Parameters:

  • disparity – Input single-channel 8-bit unsigned, 16-bit signed, 32-bit signed or 32-bit floating-point disparity image.
  • _3dImage – Output 3-channel floating-point image of the same size as disparity. Each element of _3dImage(x,y) contains 3D coordinates of the point (x,y) computed from the disparity map.
  • Q – 4x4 perspective transformation matrix that can be obtained with stereoRectify().
  • handleMissingValues – Indicates, whether the function should handle missing values (i.e. points where the disparity was not computed). If handleMissingValues=true, then pixels with the minimal disparity that corresponds to the outliers (see StereoBM::operator()) are transformed to 3D points with a very large Z value (currently set to 10000).
  • ddepth – The optional output array depth. If it is -1, the output image will have CV_32F depth. ddepth can also be set to CV_16S, CV_32S or `CV_32F'.

How can I get the Q matrix? Is possible to obtain the Q matrix with F, H1 and H2 or in another way?

3) Is there another way for obtain the xyz coordinates without calibrating the cameras?

My code is:

#include <opencv2/core/core.hpp>
#include <opencv2/calib3d/calib3d.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/contrib/contrib.hpp>
#include <opencv2/features2d/features2d.hpp>
#include <stdio.h>
#include <iostream>
#include <vector>
#include <conio.h>
#include <opencv/cv.h>
#include <opencv/cxcore.h>
#include <opencv/cvaux.h>


using namespace cv;
using namespace std;

int main(int argc, char *argv[]){

    // Read the images
    Mat imgLeft = imread( argv[1], CV_LOAD_IMAGE_GRAYSCALE );
    Mat imgRight = imread( argv[2], CV_LOAD_IMAGE_GRAYSCALE );

    // check
    if (!imgLeft.data || !imgRight.data)
            return 0;

    // 1] find pair keypoints on both images (SURF, SIFT):::::::::::::::::::::::::::::

    // vector of keypoints
    std::vector<cv::KeyPoint> keypointsLeft;
    std::vector<cv::KeyPoint> keypointsRight;

    // Construct the SURF feature detector object
    cv::SiftFeatureDetector sift(
            0.01, // feature threshold
            10); // threshold to reduce
                // sensitivity to lines
                // Detect the SURF features

    // Detection of the SIFT features
    sift.detect(imgLeft,keypointsLeft);
    sift.detect(imgRight,keypointsRight);

    std::cout << "Number of SURF points (1): " << keypointsLeft.size() << std::endl;
    std::cout << "Number of SURF points (2): " << keypointsRight.size() << std::endl;

    // 2] compute descriptors of these keypoints (SURF,SIFT) ::::::::::::::::::::::::::

    // Construction of the SURF descriptor extractor
    cv::SurfDescriptorExtractor surfDesc;

    // Extraction of the SURF descriptors
    cv::Mat descriptorsLeft, descriptorsRight;
    surfDesc.compute(imgLeft,keypointsLeft,descriptorsLeft);
    surfDesc.compute(imgRight,keypointsRight,descriptorsRight);

    std::cout << "descriptor matrix size: " << descriptorsLeft.rows << " by " << descriptorsLeft.cols << std::endl;

    // 3] matching keypoints from image right and image left according to their descriptors (BruteForce, Flann based approaches)

    // Construction of the matcher
    cv::BruteForceMatcher<cv::L2<float> > matcher;

    // Match the two image descriptors
    std::vector<cv::DMatch> matches;
    matcher.match(descriptorsLeft,descriptorsRight, matches);

    std::cout << "Number of matched points: " << matches.size() << std::endl;


    // 4] find the fundamental mat ::::::::::::::::::::::::::::::::::::::::::::::::::::

    // Convert 1 vector of keypoints into
    // 2 vectors of Point2f for compute F matrix
    // with cv::findFundamentalMat() function
    std::vector<int> pointIndexesLeft;
    std::vector<int> pointIndexesRight;
    for (std::vector<cv::DMatch>::const_iterator it= matches.begin(); it!= matches.end(); ++it) {

         // Get the indexes of the selected matched keypoints
         pointIndexesLeft.push_back(it->queryIdx);
         pointIndexesRight.push_back(it->trainIdx);
    }

    // Convert keypoints into Point2f
    std::vector<cv::Point2f> selPointsLeft, selPointsRight;
    cv::KeyPoint::convert(keypointsLeft,selPointsLeft,pointIndexesLeft);
    cv::KeyPoint::convert(keypointsRight,selPointsRight,pointIndexesRight);

    /* check by drawing the points
    std::vector<cv::Point2f>::const_iterator it= selPointsLeft.begin();
    while (it!=selPointsLeft.end()) {

            // draw a circle at each corner location
            cv::circle(imgLeft,*it,3,cv::Scalar(255,255,255),2);
            ++it;
    }

    it= selPointsRight.begin();
    while (it!=selPointsRight.end()) {

            // draw a circle at each corner location
            cv::circle(imgRight,*it,3,cv::Scalar(255,255,255),2);
            ++it;
    } */

    // Compute F matrix from n>=8 matches
    cv::Mat fundemental= cv::findFundamentalMat(
            cv::Mat(selPointsLeft), // points in first image
            cv::Mat(selPointsRight), // points in second image
            CV_FM_RANSAC);       // 8-point method

    std::cout << "F-Matrix size= " << fundemental.rows << "," << fundemental.cols << std::endl;

    /* draw the left points corresponding epipolar lines in right image
    std::vector<cv::Vec3f> linesLeft;
    cv::computeCorrespondEpilines(
            cv::Mat(selPointsLeft), // image points
            1,                      // in image 1 (can also be 2)
            fundemental,            // F matrix
            linesLeft);             // vector of epipolar lines

    // for all epipolar lines
    for (vector<cv::Vec3f>::const_iterator it= linesLeft.begin(); it!=linesLeft.end(); ++it) {

        // draw the epipolar line between first and last column
        cv::line(imgRight,cv::Point(0,-(*it)[2]/(*it)[1]),cv::Point(imgRight.cols,-((*it)[2]+(*it)[0]*imgRight.cols)/(*it)[1]),cv::Scalar(255,255,255));
    }

    // draw the left points corresponding epipolar lines in left image
    std::vector<cv::Vec3f> linesRight;
    cv::computeCorrespondEpilines(cv::Mat(selPointsRight),2,fundemental,linesRight);
    for (vector<cv::Vec3f>::const_iterator it= linesRight.begin(); it!=linesRight.end(); ++it) {

        // draw the epipolar line between first and last column
        cv::line(imgLeft,cv::Point(0,-(*it)[2]/(*it)[1]), cv::Point(imgLeft.cols,-((*it)[2]+(*it)[0]*imgLeft.cols)/(*it)[1]), cv::Scalar(255,255,255));
    }

    // Display the images with points and epipolar lines
    cv::namedWindow("Right Image Epilines");
    cv::imshow("Right Image Epilines",imgRight);
    cv::namedWindow("Left Image Epilines");
    cv::imshow("Left Image Epilines",imgLeft);
    */

    // 5] stereoRectifyUncalibrated()::::::::::::::::::::::::::::::::::::::::::::::::::

    //H1, H2 – The output rectification homography matrices for the first and for the second images.
    cv::Mat H1(4,4, imgRight.type());
    cv::Mat H2(4,4, imgRight.type());
    cv::stereoRectifyUncalibrated(selPointsRight, selPointsLeft, fundemental, imgRight.size(), H1, H2);


    // create the image in which we will save our disparities
    Mat imgDisparity16S = Mat( imgLeft.rows, imgLeft.cols, CV_16S );
    Mat imgDisparity8U = Mat( imgLeft.rows, imgLeft.cols, CV_8UC1 );

    // Call the constructor for StereoBM
    int ndisparities = 16*5;      // < Range of disparity >
    int SADWindowSize = 5;        // < Size of the block window > Must be odd. Is the 
                                  // size of averaging window used to match pixel  
                                  // blocks(larger values mean better robustness to
                                  // noise, but yield blurry disparity maps)

    StereoBM sbm( StereoBM::BASIC_PRESET,
        ndisparities,
        SADWindowSize );

    // Calculate the disparity image
    sbm( imgLeft, imgRight, imgDisparity16S, CV_16S );

    // Check its extreme values
    double minVal; double maxVal;

    minMaxLoc( imgDisparity16S, &minVal, &maxVal );

    printf("Min disp: %f Max value: %f \n", minVal, maxVal);

    // Display it as a CV_8UC1 image
    imgDisparity16S.convertTo( imgDisparity8U, CV_8UC1, 255/(maxVal - minVal));

    namedWindow( "windowDisparity", CV_WINDOW_NORMAL );
    imshow( "windowDisparity", imgDisparity8U );


    // 6] reprojectImageTo3D() :::::::::::::::::::::::::::::::::::::::::::::::::::::

    //Mat xyz;
    //cv::reprojectImageTo3D(imgDisparity8U, xyz, Q, true);

    //How can I get the Q matrix? Is possibile to obtain the Q matrix with 
    //F, H1 and H2 or in another way?
    //Is there another way for obtain the xyz coordinates?

    cv::waitKey();
    return 0;
}

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

微凉 2025-01-05 03:38:52

StereoRectifyUncaliblated 仅计算平面透视变换,而不是对象空间中的校正变换。有必要将此平面变换转换为物体空间变换以提取 Q 矩阵,并且我认为它需要一些相机校准参数(例如相机内在函数)。该主题可能正在进行一些研究课题。

您可能添加了一些步骤来估计相机内在参数,并提取相机的相对方向,以使您的流程正常工作。我认为,如果没有使用主动照明方法,相机校准参数对于提取场景的正确 3D 结构至关重要。

此外,还需要基于束块调整的解决方案来将所有估计值细化为更准确的值。

StereoRectifyUncalibrated calculates simply planar perspective transformation not rectification transformation in object space. It is necessary to convert this planar transformation to object space transformation to extract Q matrice, and i think some of the camera calibration parameters are required for it( like camera intrinsics ). There may have some research topics ongoing with this subject.

You may have add some steps for estimating camera intrinsics, and extracting relative orientation of cameras to make your flow work right. I think camera calibration parameters are vital for extracting proper 3d structure of the scene, if there is no active lighting method is used.

Also bundle block adjustment based solutions are required for refining all estimated values to more accurate values.

岁月蹉跎了容颜 2025-01-05 03:38:52
  1. 这个过程对我来说看起来不错。

  2. 据我所知,对于基于图像的3D建模,相机是显式校准或隐式校准的。您不想明确校准相机。无论如何你都会利用这些东西。匹配对应点对绝对是一种常用的方法。

  1. the procedure looks OK to me .

  2. as far as I know, regarding Image based 3D modelling, cameras are explicitly calibrated or implicitly calibrated. you don't want to explicitly calibrating the camera. you will make use of those things anyway. matching corresponding point pairs are definitely a heavily used approach.

2025-01-05 03:38:52

我认为你需要使用 StereoRectify 来校正你的图像并获得 Q。
该函数需要两个参数(R 和 T)来表示两个相机之间的旋转和平移。
因此您可以使用solvePnP 计算参数。该函数需要某些物体的3D真实坐标和图像中的2D点及其对应点

I think you need to use StereoRectify to rectify your images and get Q.
This function needs two parameters (R and T) the rotation and translation between two cameras.
So you can compute the parameters using solvePnP. This function needs some 3d real coordinates of the certain object and 2d points in images and their corresponding points

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文