We don’t allow questions seeking recommendations for software libraries, tutorials, tools, books, or other off-site resources. You can edit the question so it can be answered with facts and citations.
Closed 3 years ago.
Most commercial OCR engines return word and character coordinates, but you have to work with their SDKs to extract the information. Even Tesseract OCR returns position information, but it has not been easy to get at. Version 3.01 will make this easier, but a DLL interface is still being worked on.
Unfortunately, most free OCR programs use Tesseract OCR in its basic form and report only the raw ASCII results.
www.transym.com - Transym OCR - outputs coordinates.
www.rerecognition.com - KADMOS engine returns coordinates.
Caere Omnipage, Mitek, Abbyy, and Charactell also return character positions.
I'm using TessNet (a Tesseract C# wrapper) and I'm getting word coordinates with the following code:
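You can use the hocr "configfile" with tesseract by passing hocr as the last command-line argument, after the input image and the output base name. This will output a mostly HTML5 document in which each recognized word is an element whose title attribute holds its bounding box.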
While I'm pretty sure that's not how you're supposed to use XML, I found it easier than digging into the tesseract API.
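For anyone who wants the coordinates out of that document programmatically, here is a minimal Python sketch that uses only the standard library. It assumes each word sits in an element with class ocrx_word and a title attribute containing "bbox x0 y0 x1 y1", which is what Tesseract's hOCR output normally looks like. The SAMPLE_HOCR string and the helper names are made up for illustration and are not taken from the answer above.

```python
from html.parser import HTMLParser
import re

# Matches the "bbox x0 y0 x1 y1" property inside an hOCR title attribute.
BBOX_RE = re.compile(r"bbox (\d+) (\d+) (\d+) (\d+)")


class HocrWordParser(HTMLParser):
    """Collects (text, bbox) for every element with class 'ocrx_word'."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._current = None  # word element currently being read, if any
        self._depth = 0       # nesting depth of tags inside that element

    def handle_starttag(self, tag, attrs):
        if self._current is not None:
            # Tag nested inside a word, e.g. <strong>; just track the depth.
            self._depth += 1
            return
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        match = BBOX_RE.search(attrs.get("title") or "")
        if "ocrx_word" in classes and match:
            bbox = tuple(int(n) for n in match.groups())
            self._current = {"bbox": bbox, "chunks": []}

    def handle_data(self, data):
        if self._current is not None:
            self._current["chunks"].append(data)

    def handle_endtag(self, tag):
        if self._current is None:
            return
        if self._depth > 0:
            self._depth -= 1
            return
        # Closing tag of the word element itself: store the finished word.
        text = "".join(self._current["chunks"]).strip()
        self.words.append({"text": text, "bbox": self._current["bbox"]})
        self._current = None


def parse_hocr_words(hocr):
    """Return a list of {'text': str, 'bbox': (x0, y0, x1, y1)} dicts."""
    parser = HocrWordParser()
    parser.feed(hocr)
    parser.close()
    return parser.words


# Hypothetical hOCR fragment, written by hand for this example.
SAMPLE_HOCR = """
<span class='ocr_line' title='bbox 36 92 580 122'>
  <span class='ocrx_word' id='word_1_1' title='bbox 36 92 96 116; x_wconf 95'>Hello</span>
  <span class='ocrx_word' id='word_1_2' title='bbox 109 92 185 116; x_wconf 91'><strong>world</strong></span>
</span>
"""

if __name__ == "__main__":
    for word in parse_hocr_words(SAMPLE_HOCR):
        print(word["text"], word["bbox"])
```

The depth counter assumes that any tags nested inside a word element are properly closed, so treat it as a starting point rather than a robust hOCR reader.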
P.S. I realize that several comments and answers allude to this solution, but none of them actually show how to use the hocr option or describe the output you get from that.
Google Vision API does this.
https://cloud.google.com/vision/docs/detecting-text
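As a rough illustration of what you get back, the JSON response for text detection carries a textAnnotations list in which each entry has a description and a boundingPoly with vertices. The Python sketch below turns those into axis-aligned boxes. It assumes the first entry covers the whole detected text and the remaining entries are individual words, and that a vertex coordinate equal to 0 may be omitted from the JSON. The SAMPLE_RESPONSE dict and the function name are invented for this example, and the HTTP call itself is not shown.

```python
def word_boxes(response):
    """Return [(word, (x_min, y_min, x_max, y_max)), ...] from one response.

    `response` is assumed to be a single element of the "responses" array
    of a text-detection result, already decoded from JSON into a dict.
    """
    annotations = response.get("textAnnotations", [])
    boxes = []
    # Assumption: entry 0 is the full text block, so skip it.
    for annotation in annotations[1:]:
        vertices = annotation.get("boundingPoly", {}).get("vertices", [])
        # A coordinate that is 0 can be missing from the JSON, so default it.
        xs = [v.get("x", 0) for v in vertices]
        ys = [v.get("y", 0) for v in vertices]
        if xs and ys:
            box = (min(xs), min(ys), max(xs), max(ys))
            boxes.append((annotation.get("description", ""), box))
    return boxes


# Hypothetical response, written by hand to show the shape of the data.
SAMPLE_RESPONSE = {
    "textAnnotations": [
        {
            "description": "Hello world",
            "boundingPoly": {"vertices": [
                {"y": 20}, {"x": 60, "y": 20}, {"x": 60, "y": 70}, {"y": 70},
            ]},
        },
        {
            "description": "Hello",
            "boundingPoly": {"vertices": [
                {"x": 10, "y": 20}, {"x": 60, "y": 20},
                {"x": 60, "y": 40}, {"x": 10, "y": 40},
            ]},
        },
        {
            "description": "world",
            "boundingPoly": {"vertices": [
                {"y": 50}, {"x": 45, "y": 50}, {"x": 45, "y": 70}, {"y": 70},
            ]},
        },
    ]
}

if __name__ == "__main__":
    for word, box in word_boxes(SAMPLE_RESPONSE):
        print(word, box)
```

The polygon can be rotated for skewed text, so the min/max box is a simplification that is only exact for upright text.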
You may also take a look at the Gamera framework (http://gamera.informatik.hsnr.de/), a set of tools that lets you build your own OCR engine. Nevertheless, the fastest way is to use the hOCR output (http://en.wikipedia.org/wiki/HOCR) of Tesseract or OCRopus.
For Java developers:
I recommend using Tesseract with Tess4j.
You can find an example of how to locate words in an image in one of the Tess4j tests:
https://github.com/nguyenq/tess4j/blob/master/src/test/java/net/sourceforge/tess4j/TessAPITest.java#L449-L517
ABCocr.NET (our component) will allow you to obtain the coordinates of each word found. The values are accessible through the Word.Bounds property, which simply returns a System.Drawing.Rectangle.
The example below shows how you can OCR an image using ABCocr.NET and output the information you need:
Disclosure: posted by a member of the WebSupergoo team.
hOCR is one of the output formats of the Tesseract OCR engine. It contains each word and its coordinates, along with additional information such as the confidence level of the word recognition.
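To make use of that extra information, the title attribute of an hOCR element can be split into its properties. The Python sketch below assumes the usual layout of semicolon-separated properties such as "bbox 36 92 96 116; x_wconf 95", where x_wconf is the word confidence that Tesseract writes. The function names and the threshold of 60 are arbitrary choices for illustration.

```python
def parse_hocr_title(title):
    """Split an hOCR title attribute into {property_name: [values, ...]}.

    Simplification: assumes no property value contains a semicolon.
    """
    props = {}
    for part in title.split(";"):
        tokens = part.split()
        if tokens:
            props[tokens[0]] = tokens[1:]
    return props


def is_confident(title, min_conf=60.0):
    """True if the title has an x_wconf value of at least min_conf."""
    conf = parse_hocr_title(title).get("x_wconf")
    return bool(conf) and float(conf[0]) >= min_conf


if __name__ == "__main__":
    title = "bbox 36 92 96 116; x_wconf 95"
    print(parse_hocr_title(title))
    print(is_confident(title, min_conf=90))
```

Words without an x_wconf property are treated as not confident here. Whether that is the right default depends on what you do with the rejected words.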