Generic webcam calibration
I am building a website that does cool things with computer vision techniques on videos that users record live with their webcams and upload. For this, I need the camera intrinsic and distortion parameters. I am trying to figure out the best way to compute these from the user-uploaded videos. We can make no assumptions about what videos users might upload, but it is reasonable to assume that a human might be present in the video. I am still in the initial stages of this, but I am interested in knowing how others have solved this problem.
To be specific, below are the questions that I would appreciate comments on from anyone experienced in the group:
- What algorithms, libraries and techniques are available to extract intrinsic and distortion parameters of any generic webcam available in the market? [I say "extract" and not "calibrate" to include cases where intrinsic parameters are just a method call away with no calibration necessary].
- In general, how much variance have you observed in the intrinsic and distortion parameters of the webcams available on the market? Did you approximate them with a single set of intrinsic and distortion parameters, or what approach did you follow?
- What camera self-calibration methods, if any, could be employed in these scenarios? Are there any open-source or commercial libraries available that might be of some help?
- If we aim to calibrate the webcams using the videos users record and upload, what assumptions about the parameters [like fx==fy or no distortion params] make sense and sound reasonable to you?
- Would a reasonable approximation of intrinsic and distortion params for all the cameras make sense? What would be a reasonable approach to validate how good particular intrinsic and distortion parameters are for a specific webcam?
- Are there any other issues that need to be considered?
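To make the "single approximation" idea in the questions above concrete, here is a minimal sketch of what such a guess could look like. It is not a calibration: the 60 degree horizontal field of view is an arbitrary assumption, as are fx==fy, a principal point at the image centre, and zero distortion. The function name `approximate_intrinsics` is made up for illustration.

```python
import math


def approximate_intrinsics(width, height, hfov_deg=60.0):
    """Guess pinhole intrinsics from the frame size and an ASSUMED field of view."""
    if not 0.0 < hfov_deg < 180.0:
        raise ValueError("hfov_deg must be strictly between 0 and 180")
    # Pinhole geometry: half the image width subtends half the horizontal FOV.
    f = (width / 2.0) / math.tan(math.radians(hfov_deg) / 2.0)
    # Assumption: the principal point sits at the image centre.
    cx, cy = width / 2.0, height / 2.0
    # Assumption: fx == fy (square pixels) and zero skew.
    K = [[f, 0.0, cx],
         [0.0, f, cy],
         [0.0, 0.0, 1.0]]
    # Assumption: no lens distortion (k1, k2, p1, p2, k3 all zero).
    dist = [0.0, 0.0, 0.0, 0.0, 0.0]
    return K, dist


K, dist = approximate_intrinsics(640, 480, hfov_deg=60.0)
print("assumed focal length in pixels:", round(K[0][0], 1))
```

How far such a guess is from a particular webcam is exactly what the variance and validation questions above are asking.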
Comments (3)
Sometimes I am the one who brings the bad news :) and this is one of those times.
For almost all of your points the clear answer is No, None, Not, and so on. Only for the last point, about other issues, is the answer not a no but a long list :).
Actually, camera calibration without a chessboard and some specific constraints is almost impossible.
The closest implementation to a no-assumptions calibration is found in the stitching module in OpenCV. However, it is not perfect, and it does not work on random videos. Give it a try.
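As far as I understand it, the stitching approach can get away without a pattern because it assumes the camera only rotates between frames, so two frames are related by a homography H = K R K^-1. The sketch below shows how a focal length falls out of that constraint. It is my own simplified illustration, not OpenCV's code: it assumes K = diag(f, f, 1) with pixel coordinates measured from the image centre, and the helper names are made up. The rotation-only assumption is also why this tends to break on arbitrary webcam videos.

```python
import math


def rotation_homography(f, pitch, yaw):
    """Synthetic H = K R K^-1 for K = diag(f, f, 1), R = R_y(yaw) @ R_x(pitch)."""
    cp, sp = math.cos(pitch), math.sin(pitch)
    cw, sw = math.cos(yaw), math.sin(yaw)
    R = [[cw, sw * sp, sw * cp],
         [0.0, cp, -sp],
         [-sw, cw * sp, cw * cp]]
    k = [f, f, 1.0]
    # (K R K^-1)[i][j] = k[i] * R[i][j] / k[j] because K is diagonal.
    return [[k[i] * R[i][j] / k[j] for j in range(3)] for i in range(3)]


def focal_from_rotation_homography(H, eps=1e-12):
    """Estimate f from a rotation-only homography, or return None if degenerate.

    K^-1 H K is a scaled rotation, so its first two columns are orthogonal
    and have equal norm. Each condition gives one equation for f^2.
    """
    (h0, h1, _), (h3, h4, _), (h6, h7, _) = H
    # Orthogonality:  h0*h1 + h3*h4 + f^2 * h6*h7 = 0
    d_orth = h6 * h7
    # Equal norms:    h0^2 + h3^2 + f^2*h6^2 = h1^2 + h4^2 + f^2*h7^2
    d_norm = h7 * h7 - h6 * h6
    if max(abs(d_orth), abs(d_norm)) < eps:
        return None  # no perspective component, e.g. identity or in-plane rotation
    # Use the better-conditioned equation (larger denominator).
    if abs(d_orth) >= abs(d_norm):
        f_squared = -(h0 * h1 + h3 * h4) / d_orth
    else:
        f_squared = (h0 * h0 + h3 * h3 - h1 * h1 - h4 * h4) / d_norm
    return math.sqrt(f_squared) if f_squared > 0 else None


H = rotation_homography(f=600.0, pitch=0.1, yaw=0.2)
print("recovered focal length:", focal_from_rotation_homography(H))
```

On real footage H would have to be estimated from feature matches, and any camera translation or moving subject (such as the person in front of the webcam) violates the model.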
There is the famous Camera Calibration Toolbox, a good MATLAB implementation for extracting intrinsic and extrinsic parameters.
There is a variance not only amongst webcams, but also of:
I think that this is a really hard problem if you restrict yourself to making no assumptions about the video. Both the calibration and the evaluation are hard if you don't use something known, such as the checkerboard in the Camera Calibration Toolbox.
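To illustrate why evaluation needs something known: the usual quality measure is the reprojection error, which compares where known 3D points should land in the image with where they were actually observed. Below is a minimal sketch under simplifying assumptions: the points are already expressed in camera coordinates (in practice the pose has to be estimated as well, e.g. from a checkerboard), only two radial distortion terms are modelled, and all numbers are synthetic.

```python
import math


def project(point, K, dist):
    """Pinhole projection with two radial distortion terms (k1, k2)."""
    X, Y, Z = point
    x, y = X / Z, Y / Z                      # normalised image coordinates
    r2 = x * x + y * y
    k1, k2 = dist
    radial = 1.0 + k1 * r2 + k2 * r2 * r2    # radial distortion factor
    fx, fy, cx, cy = K[0][0], K[1][1], K[0][2], K[1][2]
    return fx * x * radial + cx, fy * y * radial + cy


def rms_reprojection_error(points_3d, observed_2d, K, dist):
    """Root-mean-square pixel distance between projected and observed points."""
    total = 0.0
    for point, (u, v) in zip(points_3d, observed_2d):
        pu, pv = project(point, K, dist)
        total += (pu - u) ** 2 + (pv - v) ** 2
    return math.sqrt(total / len(points_3d))


# Synthetic "ground truth" camera (made-up values) and a 5x5 grid of known points.
K_true = [[600.0, 0.0, 320.0], [0.0, 600.0, 240.0], [0.0, 0.0, 1.0]]
dist_true = (-0.2, 0.05)
grid = [(0.15 * i, 0.1 * j, 1.0) for i in range(-2, 3) for j in range(-2, 3)]
observed = [project(p, K_true, dist_true) for p in grid]

# A generic guess: different focal length, no distortion.
K_guess = [[554.3, 0.0, 320.0], [0.0, 554.3, 240.0], [0.0, 0.0, 1.0]]
dist_guess = (0.0, 0.0)

err_true = rms_reprojection_error(grid, observed, K_true, dist_true)
err_guess = rms_reprojection_error(grid, observed, K_guess, dist_guess)
print("RMS error with true parameters:   ", err_true)
print("RMS error with generic parameters:", err_guess)
```

Without known points (or some other known structure) there is nothing to compare the projections against, which is the difficulty described above.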
Many algorithms, including the one currently used in OpenCV, require that known points can be detected (e.g. corners in a chessboard). You would have to require your users to take pictures of such a known pattern, which ruins the concept of random videos. I don't have a solution to this, but you might want to consider requiring users to record videos of structured scenes (no specific patterns or objects) and use the algorithm described in:
"Camera calibration with lens distortion from low-rank textures"
http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5995548&tag=1
Haven't tried it myself though.