Closed 8 years ago.
Tesseract 3.0.1 (from trunk) has this functionality. A new class, ResultIterator, was added to the API, and it has the function you are interested in:
WordFontAttributes(bool* is_bold, bool* is_italic, bool* is_underlined, bool* is_monospace, bool* is_serif, bool* is_smallcaps, int* pointsize, int* font_id)
You can see it for yourself here.
Tesseract 3.0x's XML-based hOCR output format includes character attributes. You may want to try that.
http://code.google.com/p/tesseract-ocr/issues/detail?id=377#c5
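If you go the hOCR route, a rough sketch of reading styles back out might look like the following. It uses only the Python standard library and rests on an assumption you should check against your own output: that each word sits in a span with class ocrx_word and that bold and italic are marked with strong and em tags inside it. The sample markup and the names WordStyleParser and extract_word_styles are made up for illustration and are not part of Tesseract.

```python
from html.parser import HTMLParser

# Hand-written hOCR-like fragment for illustration only (an assumption, not
# captured Tesseract output). Real output may differ by version and config.
SAMPLE_HOCR = """
<div class='ocr_page' id='page_1' title='bbox 0 0 640 480'>
 <span class='ocr_line' id='line_1' title='bbox 10 10 300 40'>
  <span class='ocrx_word' id='word_1' title='bbox 10 10 80 40'>Plain</span>
  <span class='ocrx_word' id='word_2' title='bbox 90 10 160 40'><strong>Bold</strong></span>
  <span class='ocrx_word' id='word_3' title='bbox 170 10 300 40'><strong><em>Both</em></strong></span>
 </span>
</div>
"""


class WordStyleParser(HTMLParser):
    """Collects text plus bold/italic flags for every ocrx_word span."""

    def __init__(self):
        super().__init__()
        self.words = []        # finished words, in document order
        self._span_stack = []  # one bool per open span: is it an ocrx_word?
        self._current = None   # word currently being read, if any
        self._bold = 0         # nesting depth of open <strong> tags
        self._italic = 0       # nesting depth of open <em> tags

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            is_word = dict(attrs).get("class") == "ocrx_word"
            self._span_stack.append(is_word)
            if is_word:
                self._current = {"text": "", "bold": False, "italic": False}
        elif tag == "strong":
            self._bold += 1
        elif tag == "em":
            self._italic += 1

    def handle_endtag(self, tag):
        if tag == "span" and self._span_stack:
            # Only a closing ocrx_word span finishes the current word.
            if self._span_stack.pop() and self._current is not None:
                self._current["text"] = self._current["text"].strip()
                self.words.append(self._current)
                self._current = None
        elif tag == "strong":
            self._bold = max(0, self._bold - 1)
        elif tag == "em":
            self._italic = max(0, self._italic - 1)

    def handle_data(self, data):
        # Ignore whitespace between tags and any text outside a word span.
        if self._current is not None and data.strip():
            self._current["text"] += data
            self._current["bold"] = self._current["bold"] or self._bold > 0
            self._current["italic"] = self._current["italic"] or self._italic > 0


def extract_word_styles(hocr_text):
    """Return a list of {'text', 'bold', 'italic'} dicts, one per word."""
    parser = WordStyleParser()
    parser.feed(hocr_text)
    parser.close()
    return parser.words


if __name__ == "__main__":
    for word in extract_word_styles(SAMPLE_HOCR):
        print(word)
```

To try it on real data, pass the contents of the hOCR file Tesseract wrote to extract_word_styles instead of SAMPLE_HOCR. If your output carries font details elsewhere (for example in the title attribute), the parser would need adjusting.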