Reconstructing the 3D positions of corresponding points

Posted on 2024-10-19 20:57:26

I'm working on a project where I would like to reconstruct the 3D locations of
feature points I've extracted from my camera images. The idea is to:

1. Make a camera recording (greyscale information, VGA size: 640 x 480)
2. Extract feature points in the camera frames (I'm using SIFT for this)
3. Match features from frame[k-1] with features from frame[k] (I intend to
   use RANSAC for this, more on that later...)
4. Calculate/estimate some relative distance information between these feature
   points (this would be in some (x,y,z) coordinate system)

I've read in many papers that RANSAC is an algorithm that is used in
reconstruction, with the end result being some kind of point cloud. I want to be
able to do just that. However, I've run into a few snags, and I hope you guys
can help me out with these.

The first snag is that I do not really understand how I would be able to use
RANSAC to perform this point correspondence. I understand the concept of RANSAC
being a model-fitting tool, I just don't see how it could be used for doing
correspondence solving.

The second snag is, assuming I have my correspondence information, how to get
some kind of distance information between all these points. I've read that
perspective projection could be used to solve this, and in turn one should try
to estimate the Fundamental Matrix. Then do some math magic to be able to get
the point cloud.
Point is, I don't understand what the actual values in a Fundamental Matrix
mean. I know it gives a mathematical relation between the positions of 2
cameras (or in my case, 2 frames in a video where the camera is moving), and
that it exploits epipolar geometry. But besides this, I just don't have a clue
what the Fundamental Matrix actually entails. How is this 3x3 matrix capturing
the 6DOF of 1 camera with respect to another?
Also, I think the 'math magic' I referred to is some kind of matrix
multiplication, but I haven't found any source that explains to me what
it does and what the formulation is.

Therefore, my question is:
Could any of you point me in the right direction? I've been digging through
the references of the papers I've read so far, but these also give me the "we
solve this using the RANSAC algorithm" line, and I increasingly get the feeling
I'm looking in the wrong direction.
Is there some nice explanation of these things, perhaps in layman's terms and/or
with some illustrations?
In short: where should I be looking or where can I find this elusive piece of
information?

Thanks in advance,
Xilconic

PS: Checked wikipedia, but it's not helping me much. Also listened to the
'Fundamental Matrix Song', and it's the same story.

叶落知秋 2024-10-26 20:57:26

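I wrote my thesis on this, and used the RANSAC algorithm in it as well.

There is much more to this topic than can be covered in a few paragraphs here. Consider getting the excellent book Multiple View Geometry.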

Snag 1

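RANSAC will find a model, in this case the fundamental matrix F, even when there is a large number of outliers. Here, an outlier is a point correspondence candidate that is way off. Basically, you just keep fitting F matrices to randomly drawn points. Eventually you will find some set of points that together create a consistent model. These are the inliers. They can now be used to estimate the model (F) more accurately.

There is a simple line-fitting example in my thesis to get you started, along with an accessible explanation of RANSAC applied to the correspondence problem.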
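To make the line-fitting idea concrete, here is a minimal sketch of RANSAC on synthetic 2D points. It is not the example from the thesis; the function names, the threshold, the iteration count and the data are all assumptions chosen for illustration.

```python
import random

def fit_line(p, q):
    """Line through two points as (a, b, c) with a*x + b*y + c = 0, a^2 + b^2 = 1."""
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    norm = (a * a + b * b) ** 0.5
    if norm == 0.0:
        return None  # identical points define no line
    a, b = a / norm, b / norm
    return a, b, -(a * x1 + b * y1)

def ransac_line(points, n_iters=200, threshold=0.1, seed=0):
    """Return (line, inliers) for the hypothesis supported by the most points."""
    rng = random.Random(seed)
    best_line, best_inliers = None, []
    for _ in range(n_iters):
        p, q = rng.sample(points, 2)  # minimal sample: two points define a line
        line = fit_line(p, q)
        if line is None:
            continue
        a, b, c = line
        # Inliers lie within `threshold` (perpendicular distance) of the line.
        inliers = [(x, y) for x, y in points if abs(a * x + b * y + c) < threshold]
        if len(inliers) > len(best_inliers):
            best_line, best_inliers = line, inliers
    return best_line, best_inliers

# Synthetic data (assumed): 30 points exactly on y = 2x + 1, plus 10 gross outliers.
data_rng = random.Random(1)
on_line = [(0.5 * i, 2.0 * (0.5 * i) + 1.0) for i in range(30)]
outliers = [(data_rng.uniform(0, 15), data_rng.uniform(-40, -10)) for _ in range(10)]
points = on_line + outliers

line, inliers = ransac_line(points)
a, b, c = line
slope, intercept = -a / b, -c / b
```

For the correspondence problem the structure stays the same: the "line" becomes F, the minimal sample becomes 7 or 8 candidate matches, and the distance becomes an epipolar error.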

Snag 2

The most important thing about the F matrix is that it maps a point in one image to a line in the other image:

Fx = l', where x is a point in one image and l' is a line in the other image.
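As a small illustration of Fx = l', the sketch below uses a hand-picked F, assuming the second view is the first one shifted sideways with no rotation and the same intrinsics. In that special case every epipolar line is a horizontal scanline. The point coordinates are made up.

```python
def mat_vec(F, x):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(F[i][j] * x[j] for j in range(3)) for i in range(3)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Assumed toy F: pure sideways translation, so F is the cross-product
# matrix of the epipole (1, 0, 0).
F = [[0.0, 0.0, 0.0],
     [0.0, 0.0, -1.0],
     [0.0, 1.0, 0.0]]

x = [120.0, 85.0, 1.0]        # homogeneous point (u, v, 1) in the first image
l_prime = mat_vec(F, x)       # line (a, b, c): a*u + b*v + c = 0 in the second image

on_line = dot([97.5, 85.0, 1.0], l_prime)    # candidate match on the same row
off_line = dot([97.5, 90.0, 1.0], l_prime)   # candidate match five rows lower
```

Here `l_prime` is (0, -1, 85), i.e. the row v = 85, so the true match of x can only lie on that row. A candidate match x' is consistent with F when x'^T F x is (close to) zero.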

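The F matrix has 9 elements, but it must have rank 2 and its scale is not important, so it only has 7 degrees of freedom. There is no simple interpretation of the individual elements of the F matrix.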
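The count is 9 entries, minus 1 for the arbitrary scale, minus 1 for the constraint det(F) = 0. A linear estimate of F will generally not have rank 2 exactly, so a common fix is to zero its smallest singular value. The sketch below shows that step with NumPy on a random stand-in matrix (not a real estimate); the function name is my own.

```python
import numpy as np

def enforce_rank2(F):
    """Closest rank-2 matrix to F (in Frobenius norm), scaled to unit norm."""
    U, s, Vt = np.linalg.svd(F)
    s[2] = 0.0                        # drop the smallest singular value
    F2 = U @ np.diag(s) @ Vt
    return F2 / np.linalg.norm(F2)    # scale is arbitrary, so fix it to 1

# Stand-in for a noisy linear estimate of F (assumed random data).
rng = np.random.default_rng(0)
F_raw = rng.standard_normal((3, 3))
F = enforce_rank2(F_raw)
```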

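Using the point correspondences x <-> x' and F, the world 3D coordinates X of the depicted points can be extracted if you know the internal parameters of the camera, such as the focal length.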
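A rough sketch of that last step: once the two 3x4 camera matrices are known (with the intrinsics, they can be derived from F via the essential matrix, up to an overall scale), each 3D point follows from a small linear system. The code below is plain linear triangulation with NumPy. The intrinsics, the baseline and the test point are assumed values, and the camera matrices are simply given rather than recovered from F.

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one point seen at x1 in view 1 and x2 in view 2."""
    # Each image coordinate gives one linear equation in the homogeneous point X.
    A = np.array([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]                  # null vector of A
    return X[:3] / X[3]         # back to inhomogeneous 3D coordinates

def project(P, X):
    """Project a 3D point with a 3x4 camera matrix, returning (u, v)."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

# Assumed setup: focal length 500 px, principal point at the VGA image centre,
# second camera shifted sideways by 0.2 units with no rotation.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-0.2], [0.0], [0.0]])])

X_true = np.array([0.3, -0.1, 4.0])
X_est = triangulate(P1, P2, project(P1, X_true), project(P2, X_true))
```

With real, noisy matches the two rays do not intersect exactly, and this linear solution is usually only a starting point for a refinement step.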

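Note that when using consecutive film frames, the camera has typically moved very little, and it can be very difficult to calculate the fundamental matrix. It is, however, solvable. I suggest looking into the work of Marc Pollefeys.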


缘字诀 2024-10-26 20:57:26

Look at the first formula in the Wikipedia entry on the fundamental matrix:

x'^T F x = 0

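This is the "model" you are trying to solve with RANSAC. You have two 3xn (n>=7) matrices x and x' representing all corresponding x,y - x',y' points in the two images (the third coordinate is always the number 1), and an unknown 3x3 matrix F whose values you want to find. The RANSAC pseudocode in the Wikipedia entry is a good explanation.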
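Putting the pieces together, the following is a hedged sketch of RANSAC wrapped around a linear fit of F. It uses 8 matches per sample (the simple linear method) rather than the 7-point minimum, a plain algebraic error, and noise-free synthetic data in normalized coordinates, so the tiny threshold is an assumption that would not suit real pixel data. All names and numbers are illustrative.

```python
import numpy as np

def eight_point(x1, x2):
    """Linear estimate of F from >= 8 matches; rows of x1, x2 are (u, v)."""
    u1, v1, u2, v2 = x1[:, 0], x1[:, 1], x2[:, 0], x2[:, 1]
    # Each row is x2^T F x1 = 0 written out in the 9 entries of F (row-major).
    A = np.column_stack([u2 * u1, u2 * v1, u2, v2 * u1, v2 * v1, v2,
                         u1, v1, np.ones(len(x1))])
    F = np.linalg.svd(A)[2][-1].reshape(3, 3)   # null vector of A
    U, s, Vt = np.linalg.svd(F)                 # enforce rank 2
    F = U @ np.diag([s[0], s[1], 0.0]) @ Vt
    return F / np.linalg.norm(F)

def residuals(F, x1, x2):
    """Algebraic epipolar error |x2^T F x1| for every match."""
    h1 = np.column_stack([x1, np.ones(len(x1))])
    h2 = np.column_stack([x2, np.ones(len(x2))])
    return np.abs(np.sum(h2 * (h1 @ F.T), axis=1))

def ransac_fundamental(x1, x2, n_iters=500, threshold=1e-6, seed=0):
    rng = np.random.default_rng(seed)
    best_mask = np.zeros(len(x1), dtype=bool)
    for _ in range(n_iters):
        idx = rng.choice(len(x1), size=8, replace=False)   # random sample
        F = eight_point(x1[idx], x2[idx])                  # hypothesis
        mask = residuals(F, x1, x2) < threshold            # consensus set
        if mask.sum() > best_mask.sum():
            best_mask = mask
    # Refit on all inliers of the best hypothesis.
    return eight_point(x1[best_mask], x2[best_mask]), best_mask

# Synthetic scene (assumed): 40 points seen by two cameras; the last 10
# matches are then replaced by random positions to act as wrong matches.
rng = np.random.default_rng(42)
X = np.column_stack([rng.uniform(-2, 2, 40), rng.uniform(-2, 2, 40),
                     rng.uniform(3, 8, 40)])
angle = 0.2
R = np.array([[np.cos(angle), 0.0, np.sin(angle)],
              [0.0, 1.0, 0.0],
              [-np.sin(angle), 0.0, np.cos(angle)]])
t = np.array([1.0, 0.2, 0.1])
x1 = X[:, :2] / X[:, 2:]
X2 = X @ R.T + t
x2 = X2[:, :2] / X2[:, 2:]
x2[30:] = rng.uniform(-1, 1, (10, 2))

F, mask = ransac_fundamental(x1, x2)
```

So RANSAC does not create the matches itself: the candidates come from comparing SIFT descriptors, and RANSAC decides which of them agree with a single F. On real images you would normally normalize the coordinates first and use a geometric error with a threshold of a pixel or so.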

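Now, what is the fundamental matrix?
Think of a point in your image as a 3D line connecting the camera position and that point in 3D space. This line extends to infinity in both directions. If you look at a 3D point on that line with a different camera, then in that camera's image you will see a line passing right through that point. The transformation (a projection, really) of a point in the image to a 3D line is just a matrix operation. Projecting the 3D line onto a 2D image is also a matrix operation. F captures both of these matrix operations in one matrix. F can also be used to determine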

Maybe this helps a bit? Otherwise, I learned this from Hartley and Zisserman.
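The "two matrix operations in one matrix" picture can be written down directly when both camera matrices are known. The sketch below follows the textbook construction F = [e']_x P' P^+, where P^+ (the pseudo-inverse of the first camera) back-projects an image point onto its ray, P' projects into the second image, and [e']_x joins the result with the epipole e' to form a line. The cameras are assumed toy values in normalized coordinates (identity intrinsics).

```python
import numpy as np

def skew(v):
    """Matrix [v]_x such that skew(v) @ w equals np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def fundamental_from_cameras(P1, P2):
    C1 = np.linalg.svd(P1)[2][-1]        # centre of camera 1: P1 @ C1 = 0
    e2 = P2 @ C1                         # epipole: image of that centre in view 2
    back_project = np.linalg.pinv(P1)    # image point -> a 3D point on its ray
    F = skew(e2) @ P2 @ back_project     # ray point -> view 2 -> line through e2
    return F / np.linalg.norm(F)

# Assumed toy cameras: the first at the origin, the second rotated and shifted.
angle = 0.2
R = np.array([[np.cos(angle), 0.0, np.sin(angle)],
              [0.0, 1.0, 0.0],
              [-np.sin(angle), 0.0, np.cos(angle)]])
t = np.array([1.0, 0.2, 0.1])
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([R, t.reshape(3, 1)])
F = fundamental_from_cameras(P1, P2)

X = np.array([0.4, -0.3, 5.0, 1.0])      # a 3D point in homogeneous form
x1, x2 = P1 @ X, P2 @ X                  # its two homogeneous images
residual = x2 @ F @ x1                   # epipolar constraint, should vanish
```

With identity intrinsics this F is, up to sign and scale, the matrix [t]_x R, which is where the relative rotation and translation of the two cameras end up inside the 3x3 matrix.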