
Video scene detection implementation

Posted on 2024-10-13 22:24:51

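I am looking for an implementation of a video scene detection algorithm. Any programming language is acceptable. I found this implementation, but it is sensitive to small changes and not accurate.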


缪败 2024-10-20 22:24:51

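Another option is a managed video-understanding embedding model. If you split your video into N segments and embed each one, capturing motion, context, and so on, you can run semantic queries such as "catching a ball" or "breaking the ice".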
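
The "split your video into N segments" step can be sketched as follows. This is a minimal illustration that assumes equal-length segments and works only on timestamps; the `Segment` class and `split_into_segments` function are hypothetical names, not part of mixpeek, and cutting the actual media is left to whatever tool you use.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """Time range of one chunk, in seconds (hypothetical helper type)."""
    index: int
    start_time: float
    end_time: float

    @property
    def duration(self) -> float:
        return self.end_time - self.start_time


def split_into_segments(total_duration: float, n: int) -> List[Segment]:
    """Divide [0, total_duration] into n contiguous, equal-length segments."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    if total_duration <= 0:
        raise ValueError("total_duration must be positive")
    step = total_duration / n
    segments = []
    for i in range(n):
        start = i * step
        # Pin the last segment to the exact end to avoid float drift.
        end = total_duration if i == n - 1 else (i + 1) * step
        segments.append(Segment(index=i, start_time=start, end_time=end))
    return segments
```

Equal-length chunks are only one choice; segments aligned to detected scene boundaries would fit the same structure.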

mixpeek has a model, vuse-generic-v1: https://learn.mixpeek.com/vuse-v1-release/

Here is an example indexing pipeline:

obj = {
    "embedding": mixpeek.embed.video(chunk, "mixpeek/vuse-generic-v1"),
    "tags": mixpeek.extract.video(chunk, ""),
    "description": mixpeek.extract.video(chunk, ""),
    "file_url": file_url,
    "parent_id": FileTools.uuid(),
    "metadata": {
        "time_start": chunk.start_time,
        "time_end": chunk.end_time,
        "duration": chunk.duration,
        "fps": chunk.fps,
        "video_codec": chunk.video_codec,
        "audio_codec": chunk.audio_codec
    }
}

To query once the video embeddings are indexed:
embedding = mixpeek.embed("breaking the ice", "vuse-generic-v1")
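
The answer does not say how the query embedding is matched against the indexed segments. One common approach, assumed here rather than taken from mixpeek, is to rank the stored objects by cosine similarity to the query vector. The sketch below uses plain Python lists as embeddings; `cosine_similarity` and `top_k` are hypothetical helper names, and the `"embedding"` key mirrors the indexing object above.

```python
import math
from typing import Any, Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float],
    index: List[Dict[str, Any]],
    k: int = 3,
) -> List[Tuple[float, Dict[str, Any]]]:
    """Return the k indexed objects most similar to the query embedding."""
    scored = [(cosine_similarity(query, obj["embedding"]), obj) for obj in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy index with made-up 2-dimensional embeddings, for illustration only.
toy_index = [
    {"embedding": [0.9, 0.1], "file_url": "a.mp4"},
    {"embedding": [0.0, 1.0], "file_url": "b.mp4"},
]
best_score, best_obj = top_k([1.0, 0.0], toy_index, k=1)[0]
```

For a large index, a vector database or approximate nearest-neighbour library would replace this linear scan.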
