Optical character recognition program for photographs

Posted on 2024-10-14 18:50:50

I need to develop an optical character recognition program in Matlab (or any other language that can do this) to be able to extract the reading on this photograph.

The program must be able to take in as many picture files as possible, since I have around 40000 pictures that I need to work through.

The general aim of this task is to record intraday gas readings from the specific gas meter shown in the photograph. There is a webcam currently set up that is programmed to photograph the readings every minute, so the OCR program would help in building up historic intraday gas reading data.

Which is the best software to do this in, and are there any online resources available for this?

Comments (3)

熟人话多 2024-10-21 18:50:50

I'd break down the basic recognition steps as follows:

  1. Locate meter display within the image
  2. Isolate and clean up the digits
  3. Calculate features
  4. Classify each digit using a model you've trained using historic examples

Assuming that the camera for a particular location does not move, step 1 will only need to be performed once. Step 2 will include things like enhancing contrast and filtering noise. Step 3 can include any useful calculations you can think of, such as mean and skew of "ink" (white) pixels. Step 4 would utilize a model you build to classify a single digit as '0', '1', ... '9', and could be accomplished using k-nearest neighbors, logistic regression, SVM, neural network, etc.
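To make steps 2 to 4 a bit more concrete, here is a minimal sketch in plain Python rather than Matlab. It is only an illustration under several assumptions: each digit has already been cropped to a small grayscale image stored as a list of rows, "ink" means pixels brighter than a fixed threshold, and the features are the ink fraction plus the mean and skew of the ink pixels' row and column positions (one possible reading of "mean and skew of ink pixels"). The names `binarize`, `digit_features` and `knn_classify` are made up for this example.

```python
import math
from collections import Counter

def binarize(gray, threshold=128):
    """Step 2 (simplified): mark pixels brighter than the threshold as ink (1)."""
    return [[1 if px > threshold else 0 for px in row] for row in gray]

def _mean_and_skew(values):
    """Mean and skewness of a list of numbers (skew is 0 if there is no spread)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    if var == 0:
        return mean, 0.0
    skew = sum((v - mean) ** 3 for v in values) / n / var ** 1.5
    return mean, skew

def digit_features(binary):
    """Step 3: ink fraction plus mean/skew of ink row and column positions."""
    height, width = len(binary), len(binary[0])
    # Positions are scaled to 0..1 so that crops of different sizes are comparable.
    rows = [r / max(height - 1, 1)
            for r, row in enumerate(binary) for px in row if px]
    cols = [c / max(width - 1, 1)
            for row in binary for c, px in enumerate(row) if px]
    if not rows:  # blank crop: no ink at all
        return [0.0] * 5
    row_mean, row_skew = _mean_and_skew(rows)
    col_mean, col_skew = _mean_and_skew(cols)
    return [len(rows) / (height * width), row_mean, row_skew, col_mean, col_skew]

def knn_classify(sample, training, k=3):
    """Step 4: majority vote among the k nearest labelled feature vectors.

    `training` is a list of (feature_vector, label) pairs built from
    hand-labelled historic digit crops.
    """
    nearest = sorted(training, key=lambda item: math.dist(sample, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

In practice you would label a few hundred digit crops by hand, store their feature vectors as the training list, and then call `knn_classify(digit_features(binarize(crop)), training)` for every digit position in every photo. Any of the other classifiers mentioned above could replace the k-nearest-neighbors step without changing the rest.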

如何视而不见 2024-10-21 18:50:50

A couple of things would make step 1 in Predictor's answer easy: placing the cam directly above the meter, adding sufficient light, and maybe placing bright pink strips around the meter to help segment out the display :).

Once you do this, and the cam remains fixed, you can use a manual process once and then have it applied to all subsequent images to segment out the digits. If the lighting is good and consistent, you might just be able to use simple template matching to identify each of the segmented digits.
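A rough sketch of that idea in plain Python, assuming the photo has already been thresholded to a binary image (list of rows of 0/1), the digit boxes were marked by hand once as `(top, left, height, width)`, and one binary template per digit has been saved at the same size as the boxes. Because the camera is fixed, this compares each crop against the templates at a single position instead of sliding them over the image, so it is a simplified form of template matching. The function names are made up for the example.

```python
def crop(image, box):
    """Cut a (top, left, height, width) box out of an image given as rows."""
    top, left, height, width = box
    return [row[left:left + width] for row in image[top:top + height]]

def mismatch(a, b):
    """Number of pixels where two equally sized binary images disagree."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))

def read_meter(binary_image, digit_boxes, templates):
    """Return the reading as a string, one best-matching template per box.

    `digit_boxes` are the hand-marked digit positions (left to right) and
    `templates` maps a digit label such as "7" to its binary template.
    """
    reading = []
    for box in digit_boxes:
        patch = crop(binary_image, box)
        best = min(templates, key=lambda label: mismatch(patch, templates[label]))
        reading.append(best)
    return "".join(reading)
```

With the boxes and templates stored once, the same `read_meter` call can be run over every photo in the folder. If the mismatch count of the best template is still high for some image, that is a reasonable signal to set the photo aside for manual checking.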

Actually, once you get a sample of all the digits, you might even be able to classify them on something simpler (like sum of thresholded pictures).
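For instance, taking "sum of thresholded pictures" to mean the number of bright pixels in each digit crop, a sketch in plain Python could look like the following. It assumes grayscale crops stored as lists of rows and a fixed threshold of 128, and the helper names are invented here. This can only separate digits whose ink counts differ clearly, so whether it works for this meter's font needs checking on real samples.

```python
def ink_count(gray, threshold=128):
    """Sum of the thresholded picture: how many pixels are brighter than threshold."""
    return sum(px > threshold for row in gray for px in row)

def build_count_table(samples):
    """Average ink count per digit.

    `samples` maps a digit label to a list of example grayscale crops.
    """
    return {
        label: sum(ink_count(img) for img in images) / len(images)
        for label, images in samples.items()
    }

def classify_by_count(gray, table):
    """Pick the digit whose average ink count is closest to this crop's count."""
    count = ink_count(gray)
    return min(table, key=lambda label: abs(table[label] - count))
```

Printing the table from `build_count_table` for all ten digits is a quick way to see whether the counts are spread out enough; if two digits land close together, fall back to template matching for those.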

你的呼吸 2024-10-21 18:50:50

More recently, many object detection methods have become available that can be used to deal with this problem.
