Distance to object using stereo camera
Is there a way to calculate the distance to a specific object using a stereo camera?
Is there an equation or something I can use to get the distance using disparity or angle?
3 Answers
NOTE: Everything described here can be found in the Learning OpenCV book in the chapters on camera calibration and stereo vision. You should read these chapters to get a better understanding of the steps below.
One approach that does not require you to measure all the camera intrinsics and extrinsics yourself is to use OpenCV's calibration functions. Camera intrinsics (lens distortion/skew etc.) can be calculated with cv::calibrateCamera, while the extrinsics (the relation between the left and right camera) can be calculated with cv::stereoCalibrate. These functions take a number of points in pixel coordinates and try to map them to real-world object coordinates. OpenCV has a neat way to get such points: print out a black-and-white chessboard and use the cv::findChessboardCorners/cv::cornerSubPix functions to extract them. Around 10-15 image pairs of the chessboard should do.
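A minimal sketch of that calibration step might look like the following (modern OpenCV C++ API); the image file names, the 9x6 pattern size and the 25 mm square size are placeholder assumptions for your own setup:

    #include <opencv2/opencv.hpp>
    #include <vector>

    int main() {
        const cv::Size patternSize(9, 6);  // inner corners of the printed chessboard
        const float squareSize = 25.0f;    // mm; measure your own printout

        // Real-world coordinates of the corners, identical for every view.
        std::vector<cv::Point3f> boardPoints;
        for (int y = 0; y < patternSize.height; ++y)
            for (int x = 0; x < patternSize.width; ++x)
                boardPoints.emplace_back(x * squareSize, y * squareSize, 0.0f);

        std::vector<std::vector<cv::Point3f>> objectPoints;
        std::vector<std::vector<cv::Point2f>> leftPoints, rightPoints;
        cv::Size imageSize;

        for (int i = 0; i < 15; ++i) {  // around 10-15 pairs, as suggested above
            cv::Mat left  = cv::imread(cv::format("left%02d.png", i),  cv::IMREAD_GRAYSCALE);
            cv::Mat right = cv::imread(cv::format("right%02d.png", i), cv::IMREAD_GRAYSCALE);
            if (left.empty() || right.empty()) continue;
            imageSize = left.size();

            std::vector<cv::Point2f> cornersL, cornersR;
            if (!cv::findChessboardCorners(left,  patternSize, cornersL) ||
                !cv::findChessboardCorners(right, patternSize, cornersR))
                continue;  // skip pairs where the board was not found in both images

            cv::TermCriteria crit(cv::TermCriteria::EPS + cv::TermCriteria::COUNT, 30, 0.01);
            cv::cornerSubPix(left,  cornersL, cv::Size(11, 11), cv::Size(-1, -1), crit);
            cv::cornerSubPix(right, cornersR, cv::Size(11, 11), cv::Size(-1, -1), crit);

            objectPoints.push_back(boardPoints);
            leftPoints.push_back(cornersL);
            rightPoints.push_back(cornersR);
        }

        // Intrinsics of each camera (camera matrix + distortion coefficients).
        cv::Mat M1, D1, M2, D2;
        std::vector<cv::Mat> rvecs, tvecs;
        cv::calibrateCamera(objectPoints, leftPoints,  imageSize, M1, D1, rvecs, tvecs);
        cv::calibrateCamera(objectPoints, rightPoints, imageSize, M2, D2, rvecs, tvecs);

        // Extrinsics: rotation R and translation T of the right camera
        // relative to the left one.
        cv::Mat R, T, E, F;
        cv::stereoCalibrate(objectPoints, leftPoints, rightPoints,
                            M1, D1, M2, D2, imageSize, R, T, E, F,
                            cv::CALIB_FIX_INTRINSIC);

        // Save everything so this only has to be done once (see below).
        cv::FileStorage fs("stereo_calib.yml", cv::FileStorage::WRITE);
        fs << "M1" << M1 << "D1" << D1 << "M2" << M2 << "D2" << D2 << "R" << R << "T" << T;
        return 0;
    }

Passing cv::CALIB_FIX_INTRINSIC keeps the per-camera intrinsics fixed during cv::stereoCalibrate, so it only solves for the relation between the two cameras, which tends to be more stable than estimating everything in one pass.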
The matrices calculated by the calibration functions can be saved to disk so you don't have to repeat this process every time you start your application. You get some neat matrices here that allow you to create rectification maps (cv::stereoRectify/cv::initUndistortRectifyMap) that can later be applied to your images using cv::remap. You also get a neat matrix called Q, which is a disparity-to-depth matrix.
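The rectification step could then be sketched as below, reading back the hypothetical stereo_calib.yml written above. Incidentally, Q encodes the equation the question asks about: for an ideal rectified pair, depth follows from disparity as Z = f * B / d, where f is the focal length in pixels, B the baseline between the cameras and d the disparity.

    #include <opencv2/opencv.hpp>

    int main() {
        cv::Mat M1, D1, M2, D2, R, T;
        cv::FileStorage fs("stereo_calib.yml", cv::FileStorage::READ);
        fs["M1"] >> M1; fs["D1"] >> D1;
        fs["M2"] >> M2; fs["D2"] >> D2;
        fs["R"]  >> R;  fs["T"]  >> T;

        cv::Mat left  = cv::imread("left.png");
        cv::Mat right = cv::imread("right.png");
        cv::Size imageSize = left.size();

        // R1/R2 and P1/P2 are the rectifying rotations and new projection
        // matrices; Q is the 4x4 disparity-to-depth matrix mentioned above.
        cv::Mat R1, R2, P1, P2, Q;
        cv::stereoRectify(M1, D1, M2, D2, imageSize, R, T, R1, R2, P1, P2, Q);

        // Precompute the per-pixel remapping once, then apply it to every frame.
        cv::Mat map1x, map1y, map2x, map2y;
        cv::initUndistortRectifyMap(M1, D1, R1, P1, imageSize, CV_32FC1, map1x, map1y);
        cv::initUndistortRectifyMap(M2, D2, R2, P2, imageSize, CV_32FC1, map2x, map2y);

        cv::Mat leftRect, rightRect;
        cv::remap(left,  leftRect,  map1x, map1y, cv::INTER_LINEAR);
        cv::remap(right, rightRect, map2x, map2y, cv::INTER_LINEAR);
        return 0;
    }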
The reason to rectify your images is that once the process is complete for a pair of images (assuming your calibration is correct), every pixel/object in one image can be found on the same row in the other image.
There are a few ways you can go from here, depending on what kind of features you are looking for in the image. One way is to use OpenCV's stereo correspondence functions, such as Stereo Block Matching or Semi-Global Block Matching. These will give you a disparity map for the entire image, which can be transformed into 3D points using the Q matrix (cv::reprojectImageTo3D).
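A sketch of that dense route, assuming the rectified pair and the Q matrix from the previous step; the SGBM parameter values are illustrative and will need tuning for your cameras and scene:

    #include <opencv2/opencv.hpp>

    // Returns one 3D point (X,Y,Z) per pixel, in the left camera's frame.
    cv::Mat densePointCloud(const cv::Mat& leftRect, const cv::Mat& rightRect,
                            const cv::Mat& Q) {
        cv::Mat leftGray, rightGray;
        cv::cvtColor(leftRect,  leftGray,  cv::COLOR_BGR2GRAY);
        cv::cvtColor(rightRect, rightGray, cv::COLOR_BGR2GRAY);

        // Semi-Global Block Matching: minDisparity, numDisparities (a multiple
        // of 16) and blockSize; the remaining parameters keep their defaults.
        cv::Ptr<cv::StereoSGBM> sgbm = cv::StereoSGBM::create(0, 128, 5);
        cv::Mat disp16;
        sgbm->compute(leftGray, rightGray, disp16);

        // compute() returns fixed-point disparities scaled by 16.
        cv::Mat disp;
        disp16.convertTo(disp, CV_32F, 1.0 / 16.0);

        cv::Mat xyz;
        cv::reprojectImageTo3D(disp, xyz, Q, /*handleMissingValues=*/true);
        return xyz;
    }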
The downside of this is that unless there is a lot of texture information in the image, OpenCV isn't very good at building a dense disparity map (you will get gaps in it where it couldn't find the correct disparity for a given pixel), so another approach is to find the points you want to match yourself. Say you find the feature/object at x=40, y=110 in the left image and at x=22 in the right image (since the images are rectified, they should have the same y-value). The disparity is calculated as d = 40 - 22 = 18.
Construct a cv::Point3f(x, y, d), in our case (40, 110, 18). Find other interesting points the same way, then send all of the points to cv::perspectiveTransform (with the Q matrix as the transformation matrix; essentially this function is cv::reprojectImageTo3D, but for sparse disparity maps) and the output will be points in an XYZ coordinate system with the left camera at the origin.
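That sparse variant could be sketched as follows, assuming the Q matrix from the rectification step:

    #include <opencv2/opencv.hpp>
    #include <vector>

    std::vector<cv::Point3f> sparseTo3D(const cv::Mat& Q) {
        // The match from the text: left (40,110) vs right (22,110), so d = 18.
        // Add every other matched feature the same way.
        std::vector<cv::Point3f> dispPoints;
        dispPoints.emplace_back(40.0f, 110.0f, 18.0f);

        // Like cv::reprojectImageTo3D, but for a sparse set of points; the
        // output is XYZ in the left camera's coordinate frame.
        std::vector<cv::Point3f> xyzPoints;
        cv::perspectiveTransform(dispPoints, xyzPoints, Q);
        return xyzPoints;
    }

The Euclidean length of each resulting XYZ point is then the distance from the left camera that the question asks for.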
I am still working on it, so I will not post the entire source code yet. But I will give you a conceptual solution.
You will need the following data as input (for both cameras):

- camera position
- camera point of interest (the point at which the camera is looking)
- camera resolution (horizontal and vertical)
- camera field of view angles (horizontal and vertical)
You can measure the last one yourself, by placing the camera on a piece of paper, drawing two lines, and measuring the angle between these lines.
Cameras do not have to be aligned in any way; you only need to be able to see your object in both cameras.
Now calculate a vector from each camera to your object. You have the (X,Y) pixel coordinates of the object from each camera, and you need to calculate a vector (X,Y,Z). Note that in the simple case, where the object is seen right in the middle of the camera, the solution would simply be (camera.PointOfInterest - camera.Position).
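For the general case, a minimal sketch under a simple pinhole model might look like this; the names here (pixelToRay, fovH, the world-up choice) are illustrative assumptions rather than part of this answer, and cv::Point3d is used purely for the vector arithmetic:

    #include <opencv2/core.hpp>
    #include <cmath>

    static cv::Point3d normalized(const cv::Point3d& v) {
        return v * (1.0 / std::sqrt(v.dot(v)));
    }

    // px, py: pixel coordinates of the object in this camera's image;
    // resX, resY: camera resolution; fovH, fovV: field of view in radians.
    cv::Point3d pixelToRay(const cv::Point3d& camPos,
                           const cv::Point3d& pointOfInterest,
                           double px, double py,
                           double resX, double resY,
                           double fovH, double fovV) {
        // Camera basis built from its position and point of interest,
        // assuming the world z-axis points up.
        cv::Point3d forward = normalized(pointOfInterest - camPos);
        cv::Point3d right   = normalized(forward.cross(cv::Point3d(0, 0, 1)));
        cv::Point3d up      = right.cross(forward);

        // Offset from the image centre, scaled so that the image edges map to
        // half the field of view (image plane at distance 1 along 'forward').
        double u = (2.0 * px / resX - 1.0) * std::tan(fovH / 2.0);
        double v = (1.0 - 2.0 * py / resY) * std::tan(fovV / 2.0);

        return normalized(forward + right * u + up * v);
    }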
Once you have both vectors pointing at your target, the lines defined by these vectors should cross at one point in an ideal world. In the real world they will not, because of small measurement errors and the limited resolution of the cameras. So use the link below to calculate the distance vector between the two lines.
Distance between two lines
In that link: P0 is your first camera position, Q0 is your second camera position, and u and v are vectors starting at the camera positions and pointing at your target.
You are not interested in the actual distance that they calculate there. You need the vector Wc; we can assume that the object is in the middle of Wc. Once you have the position of your object in 3D space, you can also get whatever distance you like.
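A sketch of that final step in the same notation as the link, reusing the cv::Point3d arithmetic and the pixelToRay helper from the snippet above; the closest points on the two rays come from the standard skew-line formulas:

    #include <opencv2/core.hpp>

    // P0/Q0: camera positions; u/v: rays from pixelToRay above.
    cv::Point3d triangulate(const cv::Point3d& P0, const cv::Point3d& u,
                            const cv::Point3d& Q0, const cv::Point3d& v) {
        cv::Point3d w0 = P0 - Q0;
        double a = u.dot(u), b = u.dot(v), c = v.dot(v);
        double d = u.dot(w0), e = v.dot(w0);
        double denom = a * c - b * b;  // ~0 only when the rays are parallel

        double s = (b * e - c * d) / denom;  // parameter along the first ray
        double t = (a * e - b * d) / denom;  // parameter along the second ray

        cv::Point3d Pc = P0 + u * s;  // closest point on the first ray
        cv::Point3d Qc = Q0 + v * t;  // closest point on the second ray

        // Wc = Qc - Pc is the shortest segment between the two lines; assume
        // the object sits in its middle. The length of (result - P0) is then
        // the distance from the first camera that the question asks about.
        return (Pc + Qc) * 0.5;
    }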
I will post the entire source code soon.
I have source code that detects a human face and returns not only the depth but also the real-world coordinates, with the left camera (or the right camera, I can't remember which) as the origin. It is adapted from the source code of "Learning OpenCV" and refers to some websites to get it working. The result is generally quite accurate.