I've been searching the web for resources on recognizing numbers in images. I found many links with plenty of material on the topic, but unfortunately it has been more confusing than helpful, and I don't know where to start.
I've got an image with 5 digits in it, undistorted (no captcha or anything like that). The digits are black on a white background, written in a standard font.
My first step was to separate the digits. The algorithm I currently use is quite simple: it checks whether a column is entirely white, in which case it is a gap between characters. Then it trims each character so that there is no white border around it. This works quite well.
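The separation step described above can be sketched roughly as follows. This is only a minimal illustration, assuming the image has already been thresholded into a `boolean[][]` where `true` means a black pixel; the class name `ColumnSegmenter` and its method names are hypothetical, not taken from any library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: splits a binarized image into glyphs at all-white columns.
final class ColumnSegmenter {

    // ink[y][x] is true for a black pixel, false for white background (assumption).
    static List<boolean[][]> segment(boolean[][] ink) {
        List<boolean[][]> glyphs = new ArrayList<>();
        if (ink.length == 0) {
            return glyphs;
        }
        int width = ink[0].length;
        int start = -1; // first column of the current glyph, -1 while inside a gap
        // Iterate one past the last column so a glyph touching the right edge is closed.
        for (int x = 0; x <= width; x++) {
            boolean blank = (x == width) || columnIsBlank(ink, x);
            if (!blank && start < 0) {
                start = x;                            // a glyph begins here
            } else if (blank && start >= 0) {
                glyphs.add(trimRows(ink, start, x));  // glyph spans columns [start, x)
                start = -1;
            }
        }
        return glyphs;
    }

    static boolean columnIsBlank(boolean[][] ink, int x) {
        for (boolean[] row : ink) {
            if (row[x]) {
                return false;
            }
        }
        return true;
    }

    // Copies columns [left, right) and drops the blank rows above and below the glyph.
    // The column range is known to contain ink, so both loops terminate.
    static boolean[][] trimRows(boolean[][] ink, int left, int right) {
        int top = 0;
        int bottom = ink.length - 1;
        while (rowIsBlank(ink[top], left, right)) {
            top++;
        }
        while (rowIsBlank(ink[bottom], left, right)) {
            bottom--;
        }
        boolean[][] glyph = new boolean[bottom - top + 1][];
        for (int y = top; y <= bottom; y++) {
            glyph[y - top] = Arrays.copyOfRange(ink[y], left, right);
        }
        return glyph;
    }

    static boolean rowIsBlank(boolean[] row, int left, int right) {
        for (int x = left; x < right; x++) {
            if (row[x]) {
                return false;
            }
        }
        return true;
    }
}
```

Calling `ColumnSegmenter.segment(ink)` returns one tightly cropped array per character, in left-to-right order. Note that this only works when the characters do not touch or overlap horizontally.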
But now I'm stuck on the actual recognition of each digit. I don't know the best way to guess the correct one. I don't think comparing directly against the font is a good idea, because if the digits differ even slightly, it will no longer work.
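To make the concern concrete, here is a rough sketch of what a slightly more tolerant form of direct comparison could look like: each trimmed glyph is resampled to a fixed grid and then assigned to the reference glyph with the fewest differing pixels, rather than requiring an exact match. Everything here is an assumption for illustration only: the 16x16 grid size, the class name `TemplateMatcher`, and the `templates` array (one pre-normalized reference glyph per digit) are hypothetical, and this is not how an engine such as Tesseract works.

```java
// Hypothetical nearest-template classifier for binarized glyphs.
final class TemplateMatcher {

    static final int GRID = 16; // arbitrary normalization size (assumption)

    // Resamples a glyph to GRID x GRID using nearest-neighbor sampling.
    static boolean[][] normalize(boolean[][] glyph) {
        int h = glyph.length;
        int w = glyph[0].length;
        boolean[][] out = new boolean[GRID][GRID];
        for (int y = 0; y < GRID; y++) {
            for (int x = 0; x < GRID; x++) {
                out[y][x] = glyph[y * h / GRID][x * w / GRID];
            }
        }
        return out;
    }

    // Counts the pixels that differ between two GRID x GRID glyphs (Hamming distance).
    static int distance(boolean[][] a, boolean[][] b) {
        int diff = 0;
        for (int y = 0; y < GRID; y++) {
            for (int x = 0; x < GRID; x++) {
                if (a[y][x] != b[y][x]) {
                    diff++;
                }
            }
        }
        return diff;
    }

    // templates[d] is the normalized reference glyph for digit d (assumption).
    // Returns the index of the closest template, or -1 if there are none.
    static int classify(boolean[][] glyph, boolean[][][] templates) {
        boolean[][] normalized = normalize(glyph);
        int best = -1;
        int bestDistance = Integer.MAX_VALUE;
        for (int d = 0; d < templates.length; d++) {
            int dist = distance(normalized, templates[d]);
            if (dist < bestDistance) {
                bestDistance = dist;
                best = d;
            }
        }
        return best;
    }
}
```

Because the closest template wins, a few stray or missing pixels do not change the result, but larger variations (a different font, rotation, or stroke thickness) can still defeat a pixel-wise comparison like this one.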
Could anyone give me a hint on how this is done?
It doesn't matter for the question, but I'll be implementing this in C# or Java. I found some libraries that would do the job, but I'd like to implement it myself in order to learn something.
Comments (1)
Why not look at using an open-source OCR engine such as Tesseract?
http://code.google.com/p/tesseract-ocr/
C# Wrapper for Tesseract
http://www.pixel-technology.com/freeware/tessnet2/
Java Wrapper for Tesseract
http://sourceforge.net/projects/tessocrinjava/
While you might not consider using a third-party library to be implementing it yourself, a tremendous amount of work goes into just integrating a third-party tool. Keep in mind also that something that may seem simple (recognizing the number 5 versus the number 6) is often very complex; we're talking about thousands and thousands of lines of code. At the very least, look at the source code for Tesseract, and it'll give you a good reason to want to leverage a third-party library.
Here's another SO question that'll give you some ideas about the algorithms involved: https://stackoverflow.com/questions/850717/what-are-some-popular-ocr-algorithms