Tesseract (http://code.google.com/p/tesseract-ocr/) has some wrappers for using it from .NET. A simpler option is MODI (http://www.codeproject.com/KB/office/modi.aspx), but you need to keep an eye on the license, since it is part of the Office suite. In both cases you typically need some preprocessing of the image and, as I did in a past solution, some post-processors that use heuristics to correct the misrecognized words.
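To give an idea of what such a heuristic post-processor might look like, here is a minimal sketch in Python. It is not the code from my past solution. It assumes you have a list of the words you expect in your documents, and it replaces each OCR token with the closest expected word when the two are similar enough. The names `VOCABULARY`, `correct_word`, `correct_text` and the `cutoff` value are all made up for illustration; only `difflib.get_close_matches` is a real standard-library call.

```python
import difflib

# Hypothetical word list; in practice this would come from your own domain
# (field names on a form, product names, a language dictionary, ...).
VOCABULARY = ["invoice", "total", "amount", "date", "customer"]


def correct_word(word, vocabulary=VOCABULARY, cutoff=0.75):
    """Return the closest vocabulary word, or the token unchanged.

    cutoff is the minimum similarity (0..1) required before a token is
    replaced; 0.75 is an assumed starting point, not a tuned value.
    """
    # Leave pure numbers alone: a dictionary cannot fix them.
    if word.isdigit():
        return word
    lowered = word.lower()
    # Already a known word: keep the original token (and its casing).
    if lowered in vocabulary:
        return word
    # Ask difflib for the single best match above the similarity cutoff.
    matches = difflib.get_close_matches(lowered, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else word


def correct_text(text, vocabulary=VOCABULARY, cutoff=0.75):
    """Apply correct_word to every whitespace-separated token."""
    return " ".join(correct_word(w, vocabulary, cutoff) for w in text.split())


if __name__ == "__main__":
    # Typical OCR confusion: the digit 1 read in place of "i" or "l".
    print(correct_text("invo1ce tota1 42"))  # prints "invoice total 42"
```

A lower `cutoff` fixes more mistakes but also rewrites more tokens that were actually correct, so it is worth checking against a sample of your own scans.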