Row and column detection in OpenCV (OCR preprocessing)
First, my final goal is to process the following image with tesseract:
http://ubuntuone.com/72m0ujsL9RhgfMIlugRDWP
(I wiped out the second and the third column...)
However, tesseract has problems with the dotted background, so my idea is to pre-process the image with OpenCV. Ideally I would detect each text line individually, because removing the dotted background means applying a different threshold to the dotted lines than to the even lines. Is there a way to do this? So far I have found the Hough transform and maybe segmentation, but the results weren't very good (possibly because of wrong parameters). I'm not sure whether these are viable approaches or where my time is best invested.
Column detection would be useful, too, because the second column contains only numbers and the third only characters. Passing this "knowledge" to tesseract could improve its detection rate even more.
I would be really thankful if somebody could give me some hints on how to solve this issue, which OpenCV functions are best suited, and with which parameters. Some snippets that give me a fair idea of the different steps would be helpful, too.
Thanks in advance!
Kind regards.
Comments (1)
I would suggest you use something like erosion, as the dots seem to be rather small compared to the stroke width of the letters.
Or I would use Canny edge detection with proper thresholds, so that the rather short and thin edges of the dots are discarded.
Hope this helps, have fun!