Segmenting a document image
I'm looking for an algorithm for detecting lines (e.g. from tables) and word bounding boxes in document images.
Currently I segment the image by performing alternating horizontal and vertical projections and checking the resulting histograms for gaps. This works for some documents, but not for those that contain tables with an outer border: the border lines put ink in every row and column they span, so the histogram contains no gaps that would allow further segmentation. I am therefore looking for a more sophisticated algorithm.
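To make the failure case concrete, here is a minimal sketch of the gap search on a horizontal projection profile. The helper names `find_gaps` and `demo_page` and the tiny synthetic page are assumptions made up for illustration, not code from the original post.

```python
import numpy as np

def find_gaps(profile, min_gap=1):
    """Return (start, end) index pairs of zero runs in a profile (end exclusive)."""
    gaps = []
    start = None
    for i, value in enumerate(profile):
        if value == 0 and start is None:
            start = i                      # a run of empty rows begins
        elif value != 0 and start is not None:
            if i - start >= min_gap:
                gaps.append((start, i))    # the run ended just before row i
            start = None
    if start is not None and len(profile) - start >= min_gap:
        gaps.append((start, len(profile)))
    return gaps

def demo_page(with_border):
    """Hypothetical 20x30 binary page (1 = ink) containing two text lines."""
    page = np.zeros((20, 30), dtype=np.uint8)
    page[6:9, 3:27] = 1      # first text line
    page[12:15, 3:27] = 1    # second text line
    if with_border:
        page[4, 2:28] = 1    # top border
        page[16, 2:28] = 1   # bottom border
        page[4:17, 2] = 1    # left border
        page[4:17, 27] = 1   # right border
    return page

# Horizontal projection: amount of ink in each row.
plain_gaps = find_gaps(demo_page(False).sum(axis=1))
boxed_gaps = find_gaps(demo_page(True).sum(axis=1))
print(plain_gaps)
print(boxed_gaps)
```

Without the border, the profile has a gap between the two text lines. With the border, the left and right border columns contribute ink to every row between the top and bottom edges, so only the outer margins remain as gaps.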
Comments (1)
I'm not sure I understood your question completely. It would help if you added the image you are talking about.
In any case, use the cvHoughLines2 function (cv::HoughLines or cv::HoughLinesP in the C++ API) to detect lines in the image.
Also, OpenCV comes with a sample that detects squares. You could modify it a little to detect word bounding boxes.