Getting motion vectors from raw video
I'd like to know if there is any good (and freely available) text on how to obtain motion vectors of macroblocks in a raw video stream. This is often used in video compression, although my application of it is not video encoding.
Code that does this is available in OSS codecs, but understanding the method by reading the code is kinda hard.
My actual goal is to determine camera motion in 2D projection space, assuming the camera is only changing its orientation (NOT its position). What I'd like to do is divide the frames into macroblocks, obtain their motion vectors, and get the camera motion by averaging those vectors.
I guess OpenCV could help with this problem, but it's not available on my target platform.
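To make the last step concrete, here is a rough sketch in Python/NumPy of what I mean by averaging the block motion vectors and turning the result into pan/tilt angles. The function name, the focal length value and the sign conventions are just placeholders; producing the per-block vectors is the part I am asking about.

    import numpy as np

    def camera_pan_tilt(motion_vectors, focal_length_px):
        # motion_vectors: iterable of (dx, dy) pixel displacements, one per macroblock.
        # focal_length_px: focal length expressed in pixels (placeholder value below).
        # Assumes the camera only rotates and the angles are small, so the image
        # translation is roughly uniform across the frame.
        vectors = np.asarray(list(motion_vectors), dtype=float)
        dx, dy = vectors.mean(axis=0)            # average the per-block vectors
        pan = np.arctan2(dx, focal_length_px)    # rotation about the vertical axis
        tilt = np.arctan2(dy, focal_length_px)   # rotation about the horizontal axis
        return pan, tilt                         # radians; sign depends on the MV convention

    # Toy usage: blocks moved roughly 3 px right and 1 px down, 800 px focal length.
    print(camera_pan_tilt([(3, 1), (2.5, 0.8), (3.2, 1.1)], focal_length_px=800))

The arctan comes from the pinhole model: for a camera rotating about its own centre, a small pan of angle theta shifts the centre of the image by about f * tan(theta) pixels.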
The usual way is simple brute force: compare a macroblock against every candidate block in the reference frame and use the one that gives the smallest residual error. The code gets complex primarily because this is usually the slowest part of mv-based compression, so a lot of work goes into optimizing it, often at the expense of anything even approaching readability.

Especially for real-time compression, some encoders reduce the workload a bit by (for example) restricting the search to the original position +/- some maximum delta. This can often gain quite a bit of compression speed in exchange for a fairly small loss of compression.
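A minimal, unoptimized sketch of that search in Python/NumPy (my own naming, not lifted from any codec): each macroblock of the current frame is scored with the sum of absolute differences (SAD) against every candidate position within +/- max_delta of its original position in the reference frame, and the offset with the smallest residual wins.

    import numpy as np

    def block_motion_vectors(ref, cur, block=16, max_delta=8):
        # ref, cur: 2-D grayscale frames of the same shape.
        # Returns a list of ((bx, by), (dx, dy)) pairs: block origin in the current
        # frame and the offset to its best match in the reference frame.
        h, w = cur.shape
        vectors = []
        for by in range(0, h - block + 1, block):
            for bx in range(0, w - block + 1, block):
                target = cur[by:by + block, bx:bx + block].astype(np.int32)
                best_sad, best_mv = None, (0, 0)
                for dy in range(-max_delta, max_delta + 1):
                    for dx in range(-max_delta, max_delta + 1):
                        y, x = by + dy, bx + dx
                        if y < 0 or x < 0 or y + block > h or x + block > w:
                            continue  # candidate falls outside the reference frame
                        cand = ref[y:y + block, x:x + block].astype(np.int32)
                        sad = np.abs(target - cand).sum()  # residual error
                        if best_sad is None or sad < best_sad:
                            best_sad, best_mv = sad, (dx, dy)
                vectors.append(((bx, by), best_mv))
        return vectors

Letting max_delta span the whole frame gives the exhaustive search described above; shrinking it is exactly the speed-for-quality trade mentioned for real-time encoders.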
If you assume only camera motion, I suspect something is possible with an analysis of the FFT of successive images. For frequencies whose amplitudes have not changed much, the phase information will indicate the camera motion. I'm not sure if this will help with camera rotation, but lateral and vertical motion can probably be computed. There will be difficulties due to new information appearing on one edge and disappearing on the other, and I'm not sure how much that will hurt. This is speculative thinking in response to your question, so I have no proof or references :-)
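The usual name for this FFT-based idea is phase correlation: for a pure translation, the normalized cross-power spectrum of the two frames keeps only a phase ramp whose slope is the shift, so its inverse transform peaks at the displacement. A rough NumPy sketch, handling integer translation only (no rotation), might look like this:

    import numpy as np

    def phase_correlation_shift(frame_a, frame_b):
        # Estimate the integer (dy, dx) translation of frame_b relative to frame_a.
        A = np.fft.fft2(frame_a)
        B = np.fft.fft2(frame_b)
        cross = np.conj(A) * B
        cross /= np.abs(cross) + 1e-9        # discard amplitude, keep only the phase
        corr = np.fft.ifft2(cross).real      # correlation surface, peaks at the shift
        dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
        # Shifts larger than half the frame wrap around; map them to negative values.
        if dy > frame_a.shape[0] // 2:
            dy -= frame_a.shape[0]
        if dx > frame_a.shape[1] // 2:
            dx -= frame_a.shape[1]
        return dy, dx

Windowing the frames (for example with a Hann window) before the FFT softens the edge problem mentioned above, since the borders are exactly where new content enters and leaves.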
Sounds like you're doing a very limited SLAM project?
There is lots of reading matter at Bristol University, Imperial College, and Oxford University, for example - you might find their approaches to finding and matching candidate features from frame to frame of interest; they are much more robust than simple sums of absolute differences.
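As a sketch of the feature-based idea only (it uses OpenCV, which the questioner says is not available on the target platform): detect keypoints in consecutive frames, match their descriptors, and take a robust average of the matched displacements. ORB with brute-force Hamming matching is an arbitrary choice here, not what those research groups use.

    import cv2
    import numpy as np

    def median_feature_shift(prev_gray, cur_gray):
        # Median (dx, dy) displacement of matched ORB features between two frames.
        orb = cv2.ORB_create(nfeatures=500)
        kp1, des1 = orb.detectAndCompute(prev_gray, None)
        kp2, des2 = orb.detectAndCompute(cur_gray, None)
        if des1 is None or des2 is None:
            return None
        matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
        matches = matcher.match(des1, des2)
        if not matches:
            return None
        shifts = np.array([np.subtract(kp2[m.trainIdx].pt, kp1[m.queryIdx].pt)
                           for m in matches])
        return np.median(shifts, axis=0)  # median is robust to bad matches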
For the most low-level algorithms of this type, the term you are looking for is optical flow, and one of the easiest algorithms of that class is the Lucas-Kanade algorithm.

This overview presentation is pretty good and should give you plenty of ideas for an algorithm that does what you need: https://crypted.google.com/url?sa=t&source=web&cd=2&ved=0CCEQFjAB&url=http://www.dcc.fc.up.pt/~mcoimbra/讲座/VCS_0708/VCS%25202008%2520-%252010%2520-%2520Optical%2520Flow.pdf&ei=QQ-nTYb3O9GbOonblNIJ&usg=AFQjCNEx4s1UYHFIe_FNAQH5hB5cREvcQg&sig2=XOmp-MwiVp3I0C-RLRiOjg
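For reference, a toy single-window Lucas-Kanade step in NumPy (no image pyramid, no iteration, so it is only reliable for motions of about a pixel) could look like this; it solves the 2x2 least-squares system built from the spatial and temporal gradients.

    import numpy as np

    def lucas_kanade_step(prev, cur):
        # prev, cur: the same small window (2-D arrays) taken from two frames.
        # Returns (dx, dy), the displacement of the window content from prev to cur.
        prev = prev.astype(float)
        cur = cur.astype(float)
        Iy, Ix = np.gradient(prev)   # spatial gradients (np.gradient is row-major)
        It = cur - prev              # temporal gradient
        # Brightness constancy: Ix*dx + Iy*dy + It = 0, solved in a least-squares sense.
        A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)
        b = It.ravel()
        v, *_ = np.linalg.lstsq(A, -b, rcond=None)
        return v

Applied per macroblock instead of a full SAD search, this gives sub-pixel motion vectors cheaply, as long as the inter-frame motion stays small.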