Simple Digit Recognition OCR in OpenCV-Python
I am trying to implement a "Digit Recognition OCR" in OpenCV-Python (cv2). It is just for learning purposes. I would like to learn both the KNearest and SVM features in OpenCV.
I have 100 samples (i.e. images) of each digit. I would like to train with them.
There is a sample, letter_recog.py, that comes with the OpenCV samples. But I still couldn't figure out how to use it. I don't understand what the samples, responses, etc. are. Also, it loads a txt file at first, which I didn't understand at first.
Later, after searching a little, I found letter-recognition.data in the cpp samples. I used it and wrote code for cv2.KNearest, modeled on letter_recog.py (just for testing):
import numpy as np
import cv2
fn = 'letter-recognition.data'
a = np.loadtxt(fn, np.float32, delimiter=',', converters={ 0 : lambda ch : ord(ch)-ord('A') })
samples, responses = a[:,1:], a[:,0]
model = cv2.KNearest()
retval = model.train(samples,responses)
retval, results, neigh_resp, dists = model.find_nearest(samples, k = 10)
print results.ravel()
It gave me an array of size 20000, and I don't understand what it is.
Questions:
1) What is the letter-recognition.data file? How do I build that file from my own data set?
2) What does results.ravel() denote?
3) How can we write a simple digit recognition tool using the letter-recognition.data file (either KNearest or SVM)?
Comments (3)
Well, I decided to work through my own question to solve the above problem. What I wanted was to implement a simple OCR using the KNearest or SVM features in OpenCV. Below is what I did and how. (It is just for learning how to use KNearest for simple OCR purposes.)
1) My first question was about the letter-recognition.data file that comes with the OpenCV samples. I wanted to know what is inside that file. It contains a letter, along with 16 features of that letter.
And this SOF helped me to find it. These 16 features are explained in the paper "Letter Recognition Using Holland-Style Adaptive Classifiers". (Although I didn't understand some of the features at the end.)
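To make that layout concrete, here is a minimal sketch of how rows of this shape could be split into samples and responses. The two rows and the helper name parse_rows are only illustrative assumptions, not taken from the OpenCV sample itself.

```python
import numpy as np

# Illustrative rows in the layout described above: a capital letter
# followed by 16 integer features (the values are only examples).
rows = [
    "T,2,8,3,5,1,8,13,0,6,6,10,8,0,8,0,8",
    "I,5,12,3,7,2,10,5,5,4,13,3,9,2,8,4,10",
]

def parse_rows(lines):
    """Split each row into a numeric label (A=0 ... Z=25) and its features."""
    labels, features = [], []
    for line in lines:
        fields = line.strip().split(",")
        labels.append(ord(fields[0]) - ord("A"))
        features.append([float(v) for v in fields[1:]])
    samples = np.array(features, dtype=np.float32)
    responses = np.array(labels, dtype=np.float32)
    return samples, responses

samples, responses = parse_rows(rows)
print(samples.shape)  # one row per letter, 16 feature columns
print(responses)      # numeric class of each row
```

So "samples" is just the feature matrix (one row per letter) and "responses" is the list of known labels for those rows.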
2) I knew that, without understanding all those features, that method would be difficult to follow. I tried some other papers, but all were a little difficult for a beginner.
So I just decided to take all the pixel values as my features. (I was not worried about accuracy or performance; I just wanted it to work, even with minimal accuracy.)
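As a rough illustration of "pixel values as features", a small grayscale patch can simply be flattened into one row. The 10x10 patch size and the helper name pixels_to_feature are assumptions made for this sketch.

```python
import numpy as np

def pixels_to_feature(patch):
    """Flatten a 2-D grayscale patch into a single float32 row vector."""
    return np.asarray(patch, dtype=np.float32).reshape(1, -1)

# Hypothetical 10x10 patch containing a crude vertical stroke (a "1").
patch = np.zeros((10, 10), np.uint8)
patch[2:8, 4:6] = 255

feature = pixels_to_feature(patch)
print(feature.shape)  # one sample with 100 pixel-value features
```

Every digit image reduced to the same patch size then yields a feature row of the same length, which is what a classifier such as KNearest needs.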
I took the below image for my training data:
(I know the amount of training data is small. But since all letters are of the same font and size, I decided to try this.)
To prepare the data for training, I wrote a small piece of code in OpenCV. It does the following things:
It marks one digit with a box and waits for a manual key press. This time we press the digit key ourselves, corresponding to the digit in the box.
It saves the data in .txt files.
At the end of the manual classification of digits, all the digits in the training data (train.png) are labeled manually by ourselves. The image will look like below:
Below is the code I used for the above purpose (of course, not so clean):
Now we move on to the training and testing part.
For the testing part, I used the below image, which has the same type of letters I used for the training phase.
For training we do as follows:
Load the .txt files we already saved earlier.
For testing purposes, we do as follows:
I included the last two steps (training and testing) in the single piece of code below:
And it worked. Below is the result I got:
Here it worked with 100% accuracy. I assume this is because all the digits are of the same kind and the same size.
But anyway, this is a good start for beginners (I hope so).
For those interested in C++ code, refer to the code below.
Thanks to Abid Rahman for the nice explanation.
The procedure is the same as above, but the contour finding uses only the first hierarchy level contours, so the algorithm uses only the outer contour of each digit.
Code for creating sample and Label data
Code for training and testing
Result
In the result, the dot in the first line is detected as 8, and we haven't trained for the dot. Also, I am treating every contour in the first hierarchy level as a sample input; the user can avoid this by computing the area.
I had some problems generating the training data, because it was sometimes hard to identify the last selected letter, so I rotated the image by 1.5 degrees. Now each character is selected in order, and the test still shows a 100% accuracy rate after training. Here is the code:
For the sample data, I made some changes to the script, like this: