Improving video thumbnail extraction
I have been using FFmpeg to find the middle frame of an H.264 video file and extract a jpg thumbnail for use on a streaming portal. This is done automatically for each uploaded video.
Sometimes that frame happens to be a black frame, or is just semantically bad, i.e. a background or blurry shot that doesn't relate well to the video content.
I wonder if I can use OpenCV or some other method/library to programmatically find better thumbnails through facial recognition or frame analysis.
Comments (3)
I've run into that problem myself and came up with a very crude yet simple algorithm to ensure my thumbnails were more "interesting". How? Extract a handful of candidate frames spread across the video, save each as a jpeg, and keep the one with the largest file size.
Why does this work? Because jpeg files of a monotone 'boring' image, like an all-black screen, compress into much smaller files than an image with many objects and colors in it.
It's not perfect, but it is a viable 80/20 solution. (Solves 80% of the problem with 20% of the work.) Coding something that actually analyzes the image itself would be considerably more work.
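A minimal sketch of that approach in Python, assuming `ffmpeg` is on the PATH; the candidate count, timestamps, and file names here are illustrative, not the original author's:

```python
import os
import shutil
import subprocess
import tempfile

def pick_largest(paths):
    """Return the path with the biggest file -- the most 'interesting' jpeg."""
    return max(paths, key=os.path.getsize)

def best_thumbnail(video, duration_s, out="thumb.jpg", candidates=8):
    """Grab `candidates` frames spread across the video, keep the largest
    jpeg, on the theory that detail compresses worse than a blank frame."""
    with tempfile.TemporaryDirectory() as tmp:
        paths = []
        for i in range(candidates):
            # Spread timestamps evenly, skipping the very start and end.
            ts = duration_s * (i + 1) / (candidates + 1)
            path = os.path.join(tmp, f"cand{i:03d}.jpg")
            subprocess.run(
                ["ffmpeg", "-y", "-ss", str(ts), "-i", video,
                 "-frames:v", "1", path],
                check=True, capture_output=True)
            paths.append(path)
        shutil.copyfile(pick_largest(paths), out)
    return out
```

The size comparison is the whole trick; everything else is just frame extraction.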
Libavfilter has a thumbnail filter, which is meant to pick the most representative frame from a series of frames. Not sure how it works, but here are the docs: http://ffmpeg.org/libavfilter.html#thumbnail
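For reference, the filter can be used straight from the command line; a typical invocation looks something like this (file names are placeholders, and the `100` is the batch size the filter analyses to pick its frame):

```shell
# Analyse batches of 100 frames, output the single most representative one
ffmpeg -i input.mp4 -vf thumbnail=100 -frames:v 1 out.jpg
```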
In case anyone needs a two-liner (using ffmpeg and imagemagick):
This picks a maximum of 20 frames from the video and uses gt(scene) to pick transition moments. It uses ffmpeg to make 120-pixel-wide pngs and then imagemagick to make a gif (because ffmpeg gifs are notoriously ugly...). It might fail if nothing happens in the movie, but then you shouldn't call it a movie, should you?
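The commands themselves didn't survive here, but a sketch of what a two-liner like that could look like; the scene threshold, gif timing, and file names are my assumptions, not the original author's:

```shell
# 1) Up to 20 scene-change frames, scaled to 120 px wide, saved as pngs
ffmpeg -i input.mp4 -vf "select='gt(scene,0.3)',scale=120:-1" \
       -frames:v 20 -vsync vfr thumb%03d.png
# 2) Stitch the pngs into an animated gif with imagemagick
convert -delay 40 -loop 0 thumb*.png preview.gif
```

`gt(scene,0.3)` keeps only frames whose scene-change score exceeds the threshold, and `-vsync vfr` drops the timestamps of the discarded frames so each kept frame gets its own output file.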