Finding the bounding box of each character in a license plate
I use Emgu CV with C# to read a license plate in an image. After edge detection I want to find the bounding box of each character in it and use neural networks to recognize the characters. How can I do it?
Thanks
Well, the simplest method, since you can already detect the license plate, is to look for the dividing lines. I'm afraid I can only speculate from Google Images about Iranian number plates (if that is what you're using), but after each letter there is a break and a white or yellow area.
To find the bounding boxes of the individual letters:
You could look at the sum of each column, find where there is a peak in yellow or white, and take that as a dividing point. Or you could sum only the black components (the writing): in ideal circumstances the count starts at 0, rises while you cross a letter, and returns to 0, which gives you the extent of that letter. A little adaptive statistics may be needed here.
[EDIT]
Segment the license plate from the image. Start by looking at the sum of each column; you will notice peaks of 255 * the height of the license plate image where the white gaps between letters are. Use this as your threshold, find the middle of each peak, and you have the points that denote the letter edges. You can segment your image using this data.
The peaks may be statistically hard to segment reliably (they shouldn't be, but just in case): invert your image so that white becomes black and black becomes white. Take the sum of the columns again; in this case the peaks are the locations of the letters, so look for a change from 0 to >0 and wait until you find a 0 again. Recording the x positions where this happens gives you your letter locations. I can give you the code for the column sums if needed, but Google will also have your answer; the statistics are up to you, just translate the steps.
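To make the column-sum idea concrete, here is a minimal sketch assuming the plate has already been binarised and inverted (an Emgu `Image<Gray, byte>` with white characters on a black background); the method name `FindLetterBoxes` and the small noise threshold are illustrative choices of mine, not anything built into Emgu:

```csharp
using System.Collections.Generic;
using System.Drawing;
using Emgu.CV;
using Emgu.CV.Structure;

static class ProjectionSegmentation
{
    // Returns one rectangle per run of non-empty columns, i.e. per character.
    // 'binaryPlate' is assumed to be inverted: character pixels are 255, background is 0.
    public static List<Rectangle> FindLetterBoxes(Image<Gray, byte> binaryPlate, int noiseThreshold = 2)
    {
        var boxes = new List<Rectangle>();
        int width = binaryPlate.Width, height = binaryPlate.Height;

        // Column sums: count of white (character) pixels per column.
        var columnCounts = new int[width];
        for (int x = 0; x < width; x++)
            for (int y = 0; y < height; y++)
                if (binaryPlate.Data[y, x, 0] > 0)
                    columnCounts[x]++;

        // Walk the profile: a run that rises above the noise threshold and
        // falls back to ~0 is one letter.
        int start = -1;
        for (int x = 0; x < width; x++)
        {
            bool inLetter = columnCounts[x] > noiseThreshold;
            if (inLetter && start < 0)
            {
                start = x;                                   // rising edge: letter begins
            }
            else if (!inLetter && start >= 0)
            {
                boxes.Add(new Rectangle(start, 0, x - start, height));
                start = -1;                                  // falling edge: letter ends
            }
        }
        if (start >= 0)                                      // letter touching the right border
            boxes.Add(new Rectangle(start, 0, width - start, height));

        return boxes;
    }
}
```

Each rectangle can then be cut out with `binaryPlate.GetSubRect(box)` and passed to the recogniser; running the same trick on the row sums of each cut-out trims the boxes vertically as well.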
An alternative method
An alternative to dividing the image into separate squares or regions, and a favourite of students, is simply to scan a mask across the license plate. You feed the first ROI, say (0, 0, 100, 100), into your neural network (NN), then shift it by one pixel, (0, 1, 100, 100), and continue until all the data has been read in. You obviously risk the NN over-detecting, since it can classify the same letter many times, so whenever you classify a letter you can jump ahead 20 pixels or so to remove false classifications.
Obviously you will need to reduce the size of the license plate image to make this method quicker. I have seen accurate OCR done with 9 x 9 arrays, but you will need something larger; use your best judgement, 20x20 should suffice, but have a look.
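A rough sketch of that mask scan, assuming a cropped grayscale plate that is at least as tall as the window, a 20x20 window stepped along the x axis, and a `classify` delegate standing in for whatever trained network you end up with; the jump-ahead distance is the same guess as above:

```csharp
using System;
using System.Collections.Generic;
using System.Drawing;
using Emgu.CV;
using Emgu.CV.Structure;

static class MaskScan
{
    // Slides a winSize x winSize window across the plate and records every hit.
    // 'classify' returns the recognised character, or null when the window
    // does not contain one.
    public static List<Tuple<int, char>> Scan(
        Image<Gray, byte> plate,
        Func<Image<Gray, byte>, char?> classify,
        int winSize = 20,
        int skipAfterHit = 20)
    {
        var hits = new List<Tuple<int, char>>();
        int y = Math.Max(0, (plate.Height - winSize) / 2);   // keep the window vertically centred

        for (int x = 0; x + winSize <= plate.Width; x++)
        {
            var roi = plate.GetSubRect(new Rectangle(x, y, winSize, winSize));
            char? letter = classify(roi);
            if (letter.HasValue)
            {
                hits.Add(Tuple.Create(x, letter.Value));      // x doubles as the letter's ROI position
                x += skipAfterHit;                            // jump ahead to avoid re-classifying the same letter
            }
        }
        return hits;
    }
}
```

The jump after each hit is what keeps the same letter from being reported several times, as described above.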
[EDIT]
Efficiency
Which one is better? Well, it depends. They will all work (depending on the NN training), but the method of finding the bounding boxes of the individual letters can be hard to set up reliably. Scanning the mask across the plate and feeding all the data into the NN is usually quite reliable but can be incredibly inefficient: with 20x20 windows that is 400 data points to feed into the NN at every position, and you have to multiply that by (license plate width - 20), which gives you the maximum number of iterations through the loop. For example, a plate image 200 pixels wide gives 180 window positions, i.e. 180 * 400 = 72,000 values pushed through the network.
An NN can take a long time to train and can also be slow to execute on large amounts of data (depending on the NN). The method of segmenting each letter first is more efficient, because you rely far less on the NN and can feed it more accurate data.
The other thing to consider is that the OCR engine already built into Emgu (a Tesseract wrapper) is extremely quick, as you will see in the Emgu examples. The only way to decide on the best method is to write and compare all three. If you just need one that works, use the NN one, and wherever you get a match, note it as your letter ROI, since you will still know the x position along the license plate.
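For completeness, a minimal sketch of that built-in route, assuming Emgu CV 3.x or later and its `Emgu.CV.OCR.Tesseract` wrapper; the `tessdata` path, language, and image file name are placeholders, and the exact constructor signature is worth checking against your Emgu version:

```csharp
using System;
using Emgu.CV;
using Emgu.CV.OCR;
using Emgu.CV.Structure;

class PlateOcr
{
    static void Main()
    {
        // Assumes a tessdata folder with the trained language data next to the executable.
        using (var ocr = new Tesseract("./tessdata", "eng", OcrEngineMode.Default))
        using (var plate = new Image<Gray, byte>("plate.png"))
        {
            ocr.SetImage(plate);     // hand the (ideally cropped and thresholded) plate to Tesseract
            ocr.Recognize();
            Console.WriteLine(ocr.GetUTF8Text().Trim());
        }
    }
}
```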
I'm sorry I can't give you a more direct answer on which one is best, but there are too many factors that could affect things.
I hope some of these methods help,
Many Thanks
Chris