OCR engine designed for screen reading

Posted on 2024-09-11 06:37:39

Are there any OCR engines designed for identifying text in screen-captured images rather than scanned text? I have a project where I need to retrieve and identify text in an application, and none of the OCR engines I've tried so far have fared well with screenshots.

Ideally the engine should work well with color and with background noise, although I can make some allowances if nothing like that is available.

It will need to be .NET compatible; either written in .NET or having a .NET-callable API.

Comments (4)

丢了幸福的猪 2024-09-18 06:37:40

I've found Tesseract OCR to be pretty solid for an open source project. I've found that it can even read and decode simple captchas, like Megaupload's. I'd think with a little tweaking this could work pretty well.

The only pain is that it only accepts uncompressed TIFF images, which can be annoying.

EDIT: Philip Daubmeier already found a .NET integration, but below is code to convert a Bitmap to uncompressed TIFF.

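// Requires: using System.Drawing; using System.Drawing.Imaging; using System.Linq;

// Saves the bitmap as an uncompressed, 8-bit-per-pixel TIFF named "output.tif".
private void ConvertBitmapToTIF(Bitmap convert)
{
    ImageCodecInfo codecInfo = GetEncoderInfo("image/tiff");
    System.Drawing.Imaging.Encoder encodeCom = System.Drawing.Imaging.Encoder.Compression;
    System.Drawing.Imaging.Encoder encodeBPP = System.Drawing.Imaging.Encoder.ColorDepth;

    // Two encoder parameters: no compression, and a color depth of 8 bits per pixel.
    using (EncoderParameters parms = new EncoderParameters(2))
    {
        parms.Param[0] = new EncoderParameter(encodeCom, (long)EncoderValue.CompressionNone);
        parms.Param[1] = new EncoderParameter(encodeBPP, 8L);
        convert.Save("output.tif", codecInfo, parms);
    }
}

// Helper that the snippet calls but the original answer did not show:
// looks up the installed image encoder for a MIME type.
private static ImageCodecInfo GetEncoderInfo(string mimeType)
{
    return ImageCodecInfo.GetImageEncoders().First(codec => codec.MimeType == mimeType);
}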

This saves to a file, but the Bitmap.Save method can write to a stream also.
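
As a rough illustration of the stream variant mentioned above, the sketch below encodes a Bitmap as an uncompressed TIFF into a MemoryStream and returns the bytes. The class and method names (TiffStreamExample, ToUncompressedTiff) are made up for this example, and it assumes System.Drawing is available (Windows, or the System.Drawing.Common package where supported).

```csharp
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;
using System.Linq;

// Hypothetical helper class; not part of the original answer.
public static class TiffStreamExample
{
    // Encodes the bitmap as an uncompressed TIFF and returns the bytes
    // without writing anything to disk.
    public static byte[] ToUncompressedTiff(Bitmap source)
    {
        // Find the TIFF encoder among the installed image encoders.
        ImageCodecInfo tiffCodec = ImageCodecInfo.GetImageEncoders()
            .First(codec => codec.MimeType == "image/tiff");

        using (var parameters = new EncoderParameters(1))
        using (var stream = new MemoryStream())
        {
            // Ask the encoder not to compress the image data.
            parameters.Param[0] = new EncoderParameter(
                System.Drawing.Imaging.Encoder.Compression,
                (long)EncoderValue.CompressionNone);

            // Image.Save has an overload that takes a Stream instead of a file name.
            source.Save(stream, tiffCodec, parameters);
            return stream.ToArray();
        }
    }
}
```

The returned byte array can then be handed to whatever OCR wrapper is in use, if it accepts in-memory image data; whether it does depends on the wrapper.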

标点 2024-09-18 06:37:40

OCR technology is usually tuned for scanned text, which is at least 200 dpi; 300 dpi is recommended for reliable OCR quality. You will therefore need to put some effort into tweaking the settings to make it work on screen text, which is typically considered to be close to 96 dpi.

ABBYY has screenshot OCR software: http://www.abbyy.com/screenshot_reader/ which shows that its technology can work well under these conditions. I use it, and it just works. You may therefore want to contact ABBYY about its OCR SDK: http://www.abbyy.com/ocr_sdk/ (it can be used from .NET).

It is not cheap, but it works. Disclaimer: I work for ABBYY.

舟遥客 2024-09-18 06:37:40

You're essentially looking for the CAPTCHA circumvention tools various researchers have tried, some with success.

Another approach would be to use smoothing algorithms to interpolate 96 DPI captures up to 300 DPI (e.g., in Photoshop), then use standard OCR tools.
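
The interpolation step could also be done in code rather than in an image editor. The following is only a sketch: it assumes the capture really is 96 DPI, uses GDI+ bicubic interpolation as one possible smoothing method, and the names ScreenshotUpscaler, ScaleLength and Upscale are invented for this example. Whether upscaling actually improves recognition depends on the OCR engine and the fonts involved.

```csharp
using System;
using System.Drawing;
using System.Drawing.Drawing2D;

// Hypothetical helper class; not part of the original answer.
public static class ScreenshotUpscaler
{
    // Pixel length after rescaling from sourceDpi to targetDpi,
    // e.g. 96 pixels at 96 DPI become 300 pixels at 300 DPI.
    public static int ScaleLength(int pixels, double sourceDpi, double targetDpi)
    {
        return (int)Math.Round(pixels * targetDpi / sourceDpi);
    }

    // Returns a new, larger bitmap interpolated from the source capture.
    public static Bitmap Upscale(Bitmap source, float sourceDpi = 96f, float targetDpi = 300f)
    {
        int width = ScaleLength(source.Width, sourceDpi, targetDpi);
        int height = ScaleLength(source.Height, sourceDpi, targetDpi);

        var result = new Bitmap(width, height);
        // Record the nominal resolution in the image metadata.
        result.SetResolution(targetDpi, targetDpi);

        using (Graphics g = Graphics.FromImage(result))
        {
            // Bicubic interpolation smooths edges when enlarging.
            g.InterpolationMode = InterpolationMode.HighQualityBicubic;
            g.PixelOffsetMode = PixelOffsetMode.HighQuality;
            g.DrawImage(source, 0, 0, width, height);
        }
        return result;
    }
}
```

With the default arguments the scale factor is 300 / 96 = 3.125, so a 1920-pixel-wide capture becomes 6000 pixels wide; for full-screen captures it may be worth cropping to the text region first.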

笔芯 2024-09-18 06:37:40

Use the first answer (OCR software). For the screen capture, you could probably send a PRNTSCRN (Print Screen) keystroke and then convert the contents of the clipboard (a BMP) into a TIFF.
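
As an alternative to going through the clipboard, GDI+ can copy pixels from the screen directly. This is a different technique from the Print Screen approach described above, shown here only as a sketch: the ScreenCapture class and CaptureRegion method are names made up for this example, and it assumes a Windows desktop session with System.Drawing available.

```csharp
using System.Drawing;

// Hypothetical helper class; not part of the original answer.
public static class ScreenCapture
{
    // Copies the given rectangle (in screen coordinates) into a new Bitmap.
    public static Bitmap CaptureRegion(Rectangle region)
    {
        var bitmap = new Bitmap(region.Width, region.Height);
        using (Graphics g = Graphics.FromImage(bitmap))
        {
            // Source point on screen, destination point in the bitmap, size of the block to copy.
            g.CopyFromScreen(region.Location, Point.Empty, region.Size);
        }
        return bitmap;
    }
}
```

For example, `ScreenCapture.CaptureRegion(new Rectangle(0, 0, 800, 600))` would grab the top-left 800 by 600 pixels of the primary screen; the resulting Bitmap can then be converted to TIFF as in the first answer.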

Hope this helps you a little more with your venture.
