Can you OCR non-language-specific items with MODI?
I've got document OCR working on an image. It works fine when there are words like "coffee" or "432" on the page, but when I try to OCR a string like "abc123", I get an "OCR Running Error".
using System.IO;

MODI.Document md = new MODI.Document();
md.Create("c:\\temp\\mpk.tiff");
md.OCR(MODI.MiLANGUAGES.miLANG_ENGLISH, true, true); // <-- Error thrown here
MODI.Image image = (MODI.Image)md.Images[0];
// Write the recognized text to a file; the using block closes the writer even if Write throws.
using (StreamWriter writeFile = new StreamWriter(new FileStream("c:\\temp\\mpk.txt", FileMode.CreateNew)))
{
    writeFile.Write(image.Layout.Text);
}
md.Close();
Surely MS didn't build this library to recognize only language-based words? Or did they? Am I missing a MODI.Document setting or something?
Any help would be appreciated.
Comments (1)
Yes, they did. OCR becomes very inaccurate when it has no relevant dictionary to fall back on, or when it is given fragments that provide no context. Humans have the same problem: ABC123, ABCI23 and ABCl23 are three different strings. In practice this is solved by using special fonts that minimize the odds of letters and digits being confused, the kind you see on a bank check.
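To make the ambiguity concrete, here is a minimal sketch of how one might treat OCR readings that differ only by look-alike glyphs as the same candidate. It does not touch MODI at all: `OcrConfusables`, `Fold` and `LooksTheSame` are hypothetical names made up for this illustration, and the confusion table is an assumption, since the real set of look-alikes depends on the font and the scan quality.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper, not part of MODI: folds look-alike glyphs onto one
// representative so that readings differing only by them compare equal.
public static class OcrConfusables
{
    // Assumed confusion groups; adjust for the fonts you actually scan.
    private static readonly Dictionary<char, char> Canonical = new Dictionary<char, char>
    {
        { 'I', '1' }, // capital i
        { 'l', '1' }, // lowercase L
        { '|', '1' }, // vertical bar
        { 'O', '0' }, // capital o
    };

    // Replaces every character in the table with its representative.
    public static string Fold(string text)
    {
        var sb = new StringBuilder(text.Length);
        foreach (char c in text)
        {
            char mapped;
            sb.Append(Canonical.TryGetValue(c, out mapped) ? mapped : c);
        }
        return sb.ToString();
    }

    // True when the two readings are identical after folding.
    public static bool LooksTheSame(string a, string b)
    {
        return Fold(a) == Fold(b);
    }
}
```

With this table, `OcrConfusables.LooksTheSame("ABC123", "ABCI23")` and `OcrConfusables.LooksTheSame("ABC123", "ABCl23")` both return true, which is exactly why an engine with no dictionary or context cannot tell which of the three strings was on the page. Folding only helps when comparing candidates afterwards; it does not make the recognition itself any more reliable.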