Video Scene Detection Implementation
I am looking for an implementation of a video scene detection algorithm. Any programming language is acceptable. I found this implementation, but it is very sensitive to small changes and inaccurate.
2 Answers
Another option is a managed video-understanding embedding model. If you split your video into N segments and then embed each segment, capturing motion, context, etc., you can run semantic queries like "catching a ball" or "breaking the ice".
Mixpeek has a model, vuse-generic-v1: https://learn.mixpeek.com/vuse-v1-release/
Here's an example indexing pipeline:
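(The pipeline code did not survive in this copy. Below is a minimal sketch of what it might look like: the ffmpeg-based split_video helper and the list-of-dicts index are stand-ins, and it is an assumption that mixpeek.embed accepts a clip path the same way it accepts the query string shown further down.)

import glob
import subprocess

import mixpeek  # assumes a client exposing the embed(input, model) call used below


def split_video(path, segment_seconds=10):
    # Cut the file into fixed-length clips using ffmpeg's segment muxer
    # (hypothetical helper; real segmentation could use shot boundaries instead).
    subprocess.run(
        ["ffmpeg", "-i", path, "-f", "segment",
         "-segment_time", str(segment_seconds), "-c", "copy", "clip%03d.mp4"],
        check=True,
    )
    return sorted(glob.glob("clip*.mp4"))


def index_video(path):
    # Embed each clip and keep the vectors in a plain in-memory index.
    index = []
    for i, clip in enumerate(split_video(path)):
        # Same call shape as the query example; passing a clip path here
        # is an assumption, not documented Mixpeek behavior.
        emb = mixpeek.embed(clip, "vuse-generic-v1")
        index.append({"segment": i, "clip": clip, "embedding": emb})
    return index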
And to query once you have the video embeddings indexed:
embedding = mixpeek.embed("breaking the ice", "vuse-generic-v1")
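(The answer stops at the embed call; to actually retrieve matching segments you would score that query vector against the indexed ones. A plain cosine-similarity scan, sketched here under the assumption that embeddings come back as flat float vectors, is enough for small indexes:)

import numpy as np


def search(index, query_embedding):
    # Rank indexed segments by cosine similarity to the query embedding.
    q = np.asarray(query_embedding, dtype=float)
    q /= np.linalg.norm(q)
    scored = []
    for item in index:
        v = np.asarray(item["embedding"], dtype=float)
        scored.append((float(np.dot(q, v / np.linalg.norm(v))), item["segment"]))
    return sorted(scored, reverse=True)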
Finally, I did a simple implementation by comparing two consecutive frames:
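(The snippet itself was not preserved in this copy; the following is a minimal OpenCV sketch of the approach described, flagging a scene change when the mean absolute difference between consecutive grayscale frames exceeds a threshold. The default of 30.0 is an arbitrary placeholder, not the answerer's value.)

import cv2
import numpy as np


def detect_scene_changes(path, threshold=30.0):
    # Flag a scene change whenever the mean absolute pixel difference
    # between two consecutive grayscale frames exceeds the threshold.
    cap = cv2.VideoCapture(path)
    changes = []
    prev = None
    frame_idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        if prev is not None and np.mean(cv2.absdiff(gray, prev)) > threshold:
            changes.append(frame_idx)
        prev = gray
        frame_idx += 1
    cap.release()
    return changes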
The threshold varies from one video to another, but I got acceptable results.