OCR with the Tesseract interface

Posted 2024-07-04 00:23:41, 72 characters, 8 views, 0 comments

How do I OCR a TIFF file from C# using Tesseract's programming interface?
At the moment I only know how to do it with the executable.

神经暖 2024-07-11 00:23:41

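Disclaimer: I work at Atalasoft.

Our OCR module supports Tesseract. If it turns out not to be good enough, you can upgrade to a better engine by changing a single line of code (we provide a common interface to multiple OCR engines).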

ζ澈沫 2024-07-11 00:23:41

I discovered today that Emgu CV now includes a Tesseract wrapper. The number of unmanaged OpenCV DLLs may look a little daunting, but copying them to your output directory takes care of that. From there, the actual OCR process takes just three lines:

// The "tessdata" folder must contain the language data for "eng".
Tesseract ocr = new Tesseract(Path.Combine(Environment.CurrentDirectory, "tessdata"), "eng", Tesseract.OcrEngineMode.OEM_TESSERACT_ONLY);
ocr.Recognize(clip);          // clip is the image to recognize
optOCR.Text = ocr.GetText();  // show the recognized text

"robomatics" put together a very nice YouTube video that demonstrates a simple but effective solution.

初见终念 2024-07-11 00:23:41

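The C# program launches tesseract.exe, waits for it to exit, and then reads the output file it wrote.

// Requires: using System.Diagnostics; using System.IO;
// tesseract.exe expects "<input image> <output base>"; "input.tif" is a placeholder file name.
Process process = Process.Start("tesseract.exe", "input.tif out");
process.WaitForExit();
if (process.ExitCode == 0)
{
    // tesseract appends ".txt" to the output base name
    string content = File.ReadAllText("out.txt");
}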
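
The launch-and-read approach above can be wrapped in a small helper that also reports failures. This is only a sketch under a few assumptions: tesseract.exe is on the PATH, the paths contain no embedded double quotes, and the names TesseractCli, BuildArguments and Recognize are made up for illustration.

```csharp
using System;
using System.Diagnostics;
using System.IO;

public static class TesseractCli
{
    // Builds the argument string "<image>" "<outputBase>", quoting both
    // paths so that spaces in file names survive.
    public static string BuildArguments(string imagePath, string outputBase)
    {
        return "\"" + imagePath + "\" \"" + outputBase + "\"";
    }

    // Runs tesseract.exe (assumed to be on the PATH) and returns the text it wrote.
    public static string Recognize(string imagePath, string outputBase)
    {
        var startInfo = new ProcessStartInfo("tesseract.exe", BuildArguments(imagePath, outputBase))
        {
            UseShellExecute = false,      // required for stream redirection
            CreateNoWindow = true,
            RedirectStandardError = true
        };

        using (Process process = Process.Start(startInfo))
        {
            // Drain stderr before waiting so a full pipe cannot block the child.
            string errors = process.StandardError.ReadToEnd();
            process.WaitForExit();
            if (process.ExitCode != 0)
            {
                throw new InvalidOperationException("tesseract failed: " + errors);
            }
        }

        // tesseract appends ".txt" to the output base name.
        return File.ReadAllText(outputBase + ".txt");
    }
}
```

A call would look like `string text = TesseractCli.Recognize("scan.tif", "out");`, where "scan.tif" is a placeholder. Throwing on a non-zero exit code keeps a failed run from being mistaken for an empty page.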
