How can I read the time from video recorded by a surveillance camera?

Posted on 2024-10-08 19:57:45

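I have a problem where I need to read the recording time from video recorded by a surveillance camera.

The time is shown in the top-left area of the video. Below is a link to a screenshot of the area that shows the time. Note also that the digit color (white/black) keeps changing over the course of the video.

alt text http://i55.tinypic.com/2j5gca8.png

Please point me in the right direction for approaching this problem. I am a Java programmer, so I would prefer a Java-based approach.

Edit: Thanks to unhillbilly for the comment. I looked at the Ron Cemer OCR library, and its performance is far below what we need.

Since the OCR performance is below expectations, I plan to build a character set from screen grabs of all the digits and use an image/pixel comparison library to compare the time in each frame against that character set, which would give a probabilistic result for each comparison.

So I am looking for a good image comparison library (a non-Java library that can be run from the command line is fine). Any advice on the approach above would also be very helpful.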

等待圉鍢 2024-10-15 19:57:46

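Java OCR would suit your case well (Ron Cemer's is here). All you need to do is remove the background image, or make it always less than 50% white, so that when the image is converted to monochrome the white characters become white and the background becomes black.

Train JavaOCR on the font, extract the rectangular region from the image, remove the background, and you should be ready to go.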
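The "extract the rectangular region" step might look roughly like the sketch below. It uses only the standard `BufferedImage` API; the class name `TimestampRegion` and the coordinates passed to `crop` are hypothetical and would have to be measured once on a real frame.

```java
import java.awt.image.BufferedImage;

final class TimestampRegion {
    /**
     * Copies the w-by-h rectangle whose top-left corner is (x, y) out of a frame.
     * The coordinates are assumptions; measure them on a real frame of the video.
     */
    static BufferedImage crop(BufferedImage frame, int x, int y, int w, int h) {
        // getRGB returns the rectangle as packed ARGB ints, row by row.
        int[] pixels = frame.getRGB(x, y, w, h, null, 0, w);
        BufferedImage region = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        region.setRGB(0, 0, w, h, pixels, 0, w);
        return region;
    }
}
```

Copying the pixels, rather than keeping a view into the frame, means later background removal does not modify the original frame.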

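I would suggest an algorithm that looks at r, g, b and sets every pixel to black where the r, g, and b values are not exactly equal. That leaves only the pixels that are a perfect shade of gray. Since the image is in color and the digits are monochrome, you should be left with the digits and a bit of dust.

JavaOCR expects black characters on a white background, so after doing the above you will also need to invert the monochrome image (white = black and vice versa). Then run it through the JavaOCR library, passing it reference samples of all the characters you want it to recognize, and your problem should be (at least mostly) solved.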
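A minimal sketch of the two preprocessing steps described above, written against the standard `BufferedImage` API. The class and method names are hypothetical, and the 50% threshold (level 128) is an assumption. Compressed video may rarely contain exact grays, so the strict `r == g == b` test might need a small tolerance in practice; that is also an assumption, not something the answer states.

```java
import java.awt.image.BufferedImage;

final class GrayFilter {
    /** Sets every pixel whose r, g, b values differ to black; exact grays are kept. */
    static BufferedImage keepExactGrays(BufferedImage src) {
        BufferedImage out = new BufferedImage(src.getWidth(), src.getHeight(), BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < src.getHeight(); y++) {
            for (int x = 0; x < src.getWidth(); x++) {
                int rgb = src.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF, g = (rgb >> 8) & 0xFF, b = rgb & 0xFF;
                out.setRGB(x, y, (r == g && g == b) ? rgb : 0x000000);
            }
        }
        return out;
    }

    /** Thresholds at level 128 and inverts: bright pixels become black, the rest white. */
    static BufferedImage invertToMonochrome(BufferedImage gray) {
        BufferedImage out = new BufferedImage(gray.getWidth(), gray.getHeight(), BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < gray.getHeight(); y++) {
            for (int x = 0; x < gray.getWidth(); x++) {
                int level = gray.getRGB(x, y) & 0xFF; // channels are equal after keepExactGrays
                out.setRGB(x, y, level >= 128 ? 0x000000 : 0xFFFFFF);
            }
        }
        return out;
    }
}
```

Note that this pipeline only surfaces the white digits: black digits end up the same color as the removed background, so frames where the overlay is black would need the opposite threshold.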

晨曦÷微暖 2024-10-15 19:57:46

Try Google's Tesseract; a couple of JNI wrappers are available for it. Make sure to read the FAQ on extracting only digits.
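Instead of a JNI wrapper, the sketch below shells out to the `tesseract` command line, which the question says is acceptable. It assumes Tesseract 4 or later is installed and on the PATH, uses `--psm 7` to treat the cropped image as a single text line, and restricts output to digits through the `tessedit_char_whitelist` config variable. How strictly the whitelist is honored depends on the Tesseract version and engine, so treat the result as something to validate. The class name `TesseractDigits` is hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.util.List;

final class TesseractDigits {
    /** Builds the command line: one text line, digits only, result written to stdout. */
    static List<String> command(Path image) {
        return List.of("tesseract", image.toString(), "stdout",
                "--psm", "7",
                "-c", "tessedit_char_whitelist=0123456789");
    }

    /** Runs tesseract on the image and returns its trimmed output. */
    static String read(Path image) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command(image))
                .redirectError(ProcessBuilder.Redirect.DISCARD) // avoid blocking on stderr
                .start();
        String text = new String(process.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
        if (process.waitFor() != 0) {
            throw new IOException("tesseract exited with a non-zero status");
        }
        return text.trim();
    }
}
```

Because separators such as ":" are excluded by the whitelist, the caller would have to split the returned digit string into hours, minutes, and seconds by position.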

甜尕妞 2024-10-15 19:57:45

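It doesn't look like you need full OCR here.
I assume the digits are always in the same positions in the image. At each known position you only expect one of the digits 0-9 (in black or in white).
A simple template match at each position against each digit (10 digits in each color gives 20 templates) is very fast (real time) and should give you very accurate results.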
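One way to sketch this template match, under some assumptions: each digit cell has already been cut out of the frame as a grayscale array (`cell[y][x]` in 0-255), each template is a boolean stroke mask of the same size, and trying both polarities of one mask stands in for the 20 separate templates. The class `DigitMatcher`, the threshold of 128, and the `'#'`/`'.'` mask notation are all hypothetical.

```java
final class DigitMatcher {
    private final boolean[][][] masks; // masks[d][y][x] is true where digit d has a stroke

    DigitMatcher(boolean[][][] masks) {
        this.masks = masks;
    }

    /** Returns the index of the best-fitting mask, trying white and black digits. */
    int match(int[][] cell) {
        int best = -1;
        int bestScore = Integer.MAX_VALUE;
        for (int d = 0; d < masks.length; d++) {
            for (boolean whiteDigits : new boolean[] {true, false}) {
                int score = mismatches(cell, masks[d], whiteDigits);
                if (score < bestScore) {
                    bestScore = score;
                    best = d;
                }
            }
        }
        return best;
    }

    /** Counts pixels where the thresholded cell disagrees with the mask. */
    private static int mismatches(int[][] cell, boolean[][] mask, boolean whiteDigits) {
        int count = 0;
        for (int y = 0; y < mask.length; y++) {
            for (int x = 0; x < mask[y].length; x++) {
                boolean stroke = whiteDigits ? cell[y][x] >= 128 : cell[y][x] < 128;
                if (stroke != mask[y][x]) {
                    count++;
                }
            }
        }
        return count;
    }

    /** Parses a mask from rows of '#' (stroke) and '.' (background). */
    static boolean[][] mask(String... rows) {
        boolean[][] m = new boolean[rows.length][];
        for (int y = 0; y < rows.length; y++) {
            m[y] = new boolean[rows[y].length()];
            for (int x = 0; x < m[y].length; x++) {
                m[y][x] = rows[y].charAt(x) == '#';
            }
        }
        return m;
    }
}
```

The lowest mismatch count can also serve as the "probabilistic" score the question asks about: a high minimum suggests the cell should be flagged for manual review rather than trusted.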

春夜浅 2024-10-15 19:57:45

What format is the source in (vhs, dvd, stills)? It's possible that the time stamp is encoded in the data.

Update with more detail

While I completely understand the desire to have an automated end-to-end process (especially if you're selling this app as opposed to creating an in-house tool), it'd be more efficient to have someone manually enter the start time for each video (even if there are hundreds of them) than to spend weeks of coding getting this to work automatically.

What I'd do (failing a simple, very-fast-to-implement, super-accurate OCR solution which I don't believe exists):

Create a couple of database tables, like

video           video_group
-------         -----------
id              id
filename        title
start_time      date_created
group_id        date_modified
date_created    date_deleted
date_modified
date_deleted

video_group might contain

id| title
-----------
1 | Unassigned
2 | 711 Mockingbird @ 75
3 | Kroger storage room

The video table would be prepopulated with the video filenames by an import script. Initially assign everything a group_id of 1 (Unassigned).
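Since the asker works in Java, the import step could be sketched as below. It models only some of the columns from the video table (date_modified and date_deleted are left out for brevity) and keeps the rows in memory; the class and method names are hypothetical, and writing the rows to a real database is not shown.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.List;

final class VideoImport {
    static final int UNASSIGNED_GROUP_ID = 1; // the "Unassigned" row in video_group

    /** One row of the video table; startTime stays null until a person enters it. */
    static final class Video {
        final int id;
        final String filename;
        final LocalDateTime startTime;
        final int groupId;
        final Instant dateCreated;

        Video(int id, String filename, LocalDateTime startTime, int groupId, Instant dateCreated) {
            this.id = id;
            this.filename = filename;
            this.startTime = startTime;
            this.groupId = groupId;
            this.dateCreated = dateCreated;
        }

        /** Returns a copy of this row with the start time the user typed in. */
        Video withStartTime(LocalDateTime enteredStartTime) {
            return new Video(id, filename, enteredStartTime, groupId, dateCreated);
        }
    }

    /** Creates one row per filename, all in the Unassigned group, with no start time. */
    static List<Video> prepopulate(List<String> filenames, Instant now) {
        List<Video> rows = new ArrayList<>();
        int nextId = 1;
        for (String filename : filenames) {
            rows.add(new Video(nextId++, filename, null, UNASSIGNED_GROUP_ID, now));
        }
        return rows;
    }
}
```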

Create a simple Winforms or WPF app (pardon my ASCII art):

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
|  Group: [=========]\/ [New group...]                            |
|                                                                 |
|  File:  [=========]\/                                           |
|                                                                 |
|  Preview                                                        |
|  |--------------------------------------| [Next Video]          |
|  | (first frame of selected video here) | [Prev]                |
|  |                                      |                       |
|  |                                      |                       |
|  |                                      |                       |
|  |--------------------------------------|                       |
|  Start Time                                                     |
|  [(enter start time value here as displayed on preview frame)]  |
|                                                                 |
|  [Update]                                                       |
-------------------------------------------------------------------

Anybody could be the user: a secretary, a janitor, even a recent CS graduate. All they have to do is read the time from the preview frame, type it into the Start Time field, and click "Update" or "Next" to update the database and move on to the next one. Keep the Group selection from one video to the next unless the user changes it.

Assuming it takes the user 30 seconds to read, type, and click Next, they could complete 100-150 videos in an hour (call it 75 for a more realistic estimate). And an intern's time is a lot cheaper than a developer's.

If you really have "hundreds" of videos, it'll still be faster to do it this way than to screw around with OCR. Even if the OCR works for the most part, you'll most likely need someone to manually inspect everything to check that the results are correct, which raises the question: why bother with the OCR?
