Algorithm to detect the presence of text in an image
For a new assignment, I am looking for a method to detect the presence of text in an image. The image is a map, for example a Google Maps view. The task is to detect where the street and city labels are placed.
I know that the OpenCV library has algorithms that can detect features (for example human faces), such as the Haar classifier or HOG (histogram of oriented gradients), but I have heard that the training process for such algorithms is quite difficult.
Do you know of any algorithm, method, or library that could do this (detect the presence of text in an image)?
Thanks,
John
Comments (3)
There is a standard problem in computer vision called text detection in images. It is quite different from OCR: OCR is concerned with what the text says, while text detection is about determining whether there is text in the image. Adi Shavit's third link is a method that addresses this problem. You can also search Google Scholar for well-cited articles on text detection.
There are several possible approaches you can take.
UPDATE Jan. 2017
The OpenCV 3.2 contrib modules now include a text detection module.
It also comes with samples (C++, Python) showing how to use it.
You need to tune this to a specific type of map image; otherwise the problem is going to be very difficult (see the previous post with links to articles).
OCR is the way to go, and you should use an existing library. However, OCR is mainly done on text on white backgrounds. To reduce your problem to a regular OCR problem, you should try working in the color space of the map. The map text likely has a very specific color, and this may be enough to find its pixels. You can then filter the detected pixels based on the size of the connected regions.
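Here is a minimal sketch of the color-filtering idea, assuming the label text is close to one known RGB color. The image is a plain nested list of RGB tuples so the example has no dependencies; `TEXT_COLOR`, `TOLERANCE`, `MIN_AREA` and `MAX_AREA` are hypothetical values that would have to be tuned for each map style. With OpenCV, `cv2.inRange` and `cv2.connectedComponentsWithStats` cover the same two steps on real images.

```python
from collections import deque

TEXT_COLOR = (60, 60, 60)   # assumed label color; tune per map style
TOLERANCE = 30              # assumed per-channel tolerance
MIN_AREA, MAX_AREA = 2, 50  # assumed size range of one glyph, in pixels


def color_mask(image, color=TEXT_COLOR, tol=TOLERANCE):
    """Return a 2D list of booleans: True where a pixel is close to `color`."""
    return [
        [all(abs(c - t) <= tol for c, t in zip(px, color)) for px in row]
        for row in image
    ]


def connected_components(mask):
    """Yield each 4-connected region of True pixels as a list of (row, col)."""
    if not mask:
        return
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill starting from this unvisited pixel.
            seen[r][c] = True
            queue, region = deque([(r, c)]), []
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            yield region


def candidate_boxes(image):
    """Bounding boxes (top, left, bottom, right) of glyph-sized regions."""
    boxes = []
    for region in connected_components(color_mask(image)):
        # Drop specks that are too small and blobs that are too large.
        if MIN_AREA <= len(region) <= MAX_AREA:
            rows = [y for y, _ in region]
            cols = [x for _, x in region]
            boxes.append((min(rows), min(cols), max(rows), max(cols)))
    return boxes
```

The size filter is what separates glyph-like regions from isolated noise pixels and from large same-colored shapes such as borders, so its limits depend on the zoom level and font size of the map.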
If you only want to find the locations of the text labels, you can do the above and simply skip the OCR step. If the labels are not too close together, a simple clustering algorithm can be used to find their respective positions.
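One simple clustering choice, shown here only as a sketch, is single-linkage grouping: two glyph-sized boxes are treated as part of the same label when the gap between them is small, and each group is then merged into one bounding box. The input is assumed to be a list of `(top, left, bottom, right)` boxes of the connected regions; `MAX_GAP` is a hypothetical threshold that should be smaller than the typical distance between separate labels.

```python
MAX_GAP = 4  # assumed maximum pixel gap between glyphs of the same label


def box_gap(a, b):
    """Gap in pixels between two boxes (top, left, bottom, right); 0 if they overlap."""
    dy = max(a[0] - b[2], b[0] - a[2], 0)
    dx = max(a[1] - b[3], b[1] - a[3], 0)
    return max(dx, dy)


def cluster_boxes(boxes, max_gap=MAX_GAP):
    """Group nearby boxes with union-find and return one merged box per group."""
    parent = list(range(len(boxes)))

    def find(i):
        # Follow parent links to the group root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Link every pair of boxes that lie within max_gap of each other.
    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if box_gap(boxes[i], boxes[j]) <= max_gap:
                parent[find(i)] = find(j)

    groups = {}
    for i, box in enumerate(boxes):
        groups.setdefault(find(i), []).append(box)

    # The label position is the bounding box around all glyphs in a group.
    return [
        (min(b[0] for b in g), min(b[1] for b in g),
         max(b[2] for b in g), max(b[3] for b in g))
        for g in groups.values()
    ]


glyphs = [(10, 10, 18, 15), (10, 18, 18, 23), (10, 26, 18, 31), (60, 100, 68, 105)]
labels = cluster_boxes(glyphs)
```

The pairwise comparison is quadratic in the number of boxes, which is usually acceptable for a single map tile. It breaks down exactly in the case the answer warns about: labels that sit closer together than `MAX_GAP` get merged into one.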