Video Stabilization with OpenCV

Posted 2024-09-13 20:28:50 · 5 views · 0 comments

I have a video feed which is taken with a moving camera and contains moving objects. I would like to stabilize the video, so that all stationary objects will remain stationary in the video feed. How can I do this with OpenCV?

i.e. For example, if I have two images prev_frame and next_frame, how do I transform next_frame so the video camera appears stationary?


Comments (8)

枫以 2024-09-20 20:28:50

I can suggest one of the following solutions:

  1. Using local high-level features: OpenCV includes SURF, so: for each frame, extract SURF features. Then build a feature kd-tree (also in OpenCV), and match each pair of consecutive frames to find corresponding features. Feed those pairs into cvFindHomography to compute the homography between the frames. Warp the frames according to the (combined) homographies to stabilize. This is, to my knowledge, a very robust and sophisticated approach; however, SURF extraction and matching can be quite slow.
  2. You can try the above with "less robust" features if you expect only minor movement between frames, e.g. use Harris corner detection, pair up the corners closest to each other in both frames, and feed those to cvFindHomography as above. Probably faster but less robust.
  3. If you restrict the movement to translation, you might be able to replace cvFindHomography with something simpler that just estimates the translation between the feature pairs (e.g. their average).
  4. Use phase correlation (see http://en.wikipedia.org/wiki/Phase_correlation) if you expect only translation between frames. OpenCV includes DFT/FFT and IFFT; see the linked Wikipedia article for formulas and explanation.

EDIT
Three remarks I should mention explicitly, just in case:

  1. The homography-based approach is likely very exact, so stationary objects will remain stationary. However, homographies include perspective distortion and zoom as well, so the result might look a bit uncommon (or even distorted for some fast movements). Although exact, this might be less visually pleasing; use it rather for further processing or, say, forensics. But you should try it out; it could be very pleasing for some scenes/movements as well.
  2. To my knowledge, at least several free video-stabilization tools use phase correlation. If you just want to "un-shake" the camera, this might be preferable.
  3. There is quite some research going on in this field. You'll find far more sophisticated approaches in some papers (although they likely require more than just OpenCV).
终陌 2024-09-20 20:28:50

OpenCV has the functions estimateRigidTransform() and warpAffine(), which handle this sort of problem really well.

It's pretty much as simple as this:

Mat M = estimateRigidTransform(frame1, frame2, false);
warpAffine(frame2, output, M, Size(640,480), INTER_NEAREST|WARP_INVERSE_MAP);

Now output contains the contents of frame2, best aligned to fit frame1.
For large shifts, M will be a zero matrix, or it might not be a matrix at all depending on the version of OpenCV, so you'd have to filter those out and not apply them. I'm not sure how large a shift that takes; maybe half the frame width, maybe more.

The third parameter to estimateRigidTransform is a boolean that tells it whether to fit an arbitrary affine matrix or to restrict it to translation/rotation/scaling. For stabilizing an image from a camera you probably just want the latter. In fact, for camera image stabilization you might also want to remove any scaling from the returned matrix by normalizing it to rotation and translation only.

Also, for a moving camera, you'd probably want to sample M over time and calculate a running average.

See the OpenCV documentation for more info on estimateRigidTransform() and warpAffine().

家住魔仙堡 2024-09-20 20:28:50

OpenCV now has a video stabilization module: http://docs.opencv.org/trunk/d5/d50/group__videostab.html

陌若浮生 2024-09-20 20:28:50

I pasted my answer from this one: How to stabilize Webcam video?


Yesterday I did some work (in Python) on this subject; the main steps are:

  1. use cv2.goodFeaturesToTrack to find good corners.
  2. use cv2.calcOpticalFlowPyrLK to track the corners.
  3. use cv2.findHomography to compute the homography matrix.
  4. use cv2.warpPerspective to transform the video frames.

But the result is not that ideal yet; maybe I should choose SIFT keypoints rather than goodFeatures.


Source:

[image: source frame]

Stabilize the car:

[image: stabilized result]

阳光的暖冬 2024-09-20 20:28:50

I should add the following remarks to complete zerm's answer.
It will simplify your problem if one stationary object is chosen and you then work with zerm's approach (1) on that single object.
If you find a stationary object and apply the correction to it, I think it is safe to assume the other stationary objects will also look stable.

Although it is certainly valid for your tough problem, you will run into the following issues with this approach:

  • Detection and homography estimation will sometimes fail for various reasons: occlusions, sudden moves, motion blur, severe lighting differences. You will have to find ways to handle them.

  • Your target object(s) might be occluded, meaning detection will fail on that frame, and you will have to handle occlusions, which is itself a whole research topic.

  • Depending on your hardware and the complexity of your solution, you might have some trouble achieving real-time results with SURF. You might try OpenCV's GPU implementation or other, faster feature detectors like ORB, BRIEF or FREAK.

北陌 2024-09-20 20:28:50

There are already good answers here, but they use somewhat older algorithms, and I developed a program to solve a similar problem, so I am adding an additional answer.

  1. First, extract features from the image using a feature extractor such as the SIFT or SURF algorithm. In my case, the FAST+ORB combination worked best. If you want more information, see this paper.
  2. Once you have the features in both images, find matches between them. There are several matchers; the brute-force matcher is not bad. If brute force is too slow on your system, use an algorithm like a kd-tree.
  3. Last, compute the geometric transformation matrix that minimizes the error of the transformed points. You can use the RANSAC algorithm in this process.
    You can develop this whole pipeline with OpenCV; I have already developed it for mobile devices. See this repository.
蓝梦月影 2024-09-20 20:28:50

This is a tricky problem, but I can suggest a somewhat simple solution off the top of my head.

  1. Shift/rotate next_frame by some candidate amount.
  2. Use background subtraction, threshold(abs(prev_frame - next_frame_rotated)), to find the static elements. You'll have to experiment with the threshold value.
  3. Find min(template_match(prev_frame_background, next_frame_rotated_background)).
  4. Record the shift/rotation of the closest match and apply it to next_frame.

This won't work well over many frames, so you'll want to look into using a background accumulator, so that the background the algorithm looks for stays similar over time.

紧拥背影 2024-09-20 20:28:50

Background:
I was working on a research project where I was trying to calculate how long it would take a person standing in a queue to reach the counter. The first thing I needed was footage, so I went to the campus and recorded some tourists moving through a ticket queue. At that point I had no idea how I was going to calculate the queuing time or what precautions I should take while recording. At the end of the day I found that all the footage I had recorded was shot with a shaky camera, so before developing any solution for the queuing time I first had to stabilize the video.

Video Stabilization using Template Matching

  • Find static objects such as a pole, a door, or anything you know is not supposed to move.
  • Use template matching to calculate the offset of the static object's location (relative to the frame boundaries) in each consecutive frame.
  • Translate each frame by the offset values, say tx and ty.

Result footage:

[gif showing the result of this technique]

As you can see in the gif, the selected static object remains static w.r.t. the frame boundaries, while the motion shows up as black filling at the edges of the frame.
