What would you recommend for recognizing all characters in a screenshot? The screenshot is perfectly clear (only black text on a white background), and I can choose any standard font installed on Windows for the text. I have tried some OCR tools (Tesseract and others), but they misrecognized some characters. That baffled me, since the text has no noise at all and the fonts were among the most common ones (Courier New, Fixedsys, etc.). I need the result to be 100% accurate. Is there a library for this specific purpose, some kind of pattern recognition? Or should I take the screenshot with a monospaced font, iterate through the image moving right by +font_size pixels each time, and compare each captured cell to an in-memory representation of the letters and digits of the same font at the same size? What would be the best approach to this problem? Thank you very much in advance.
UPDATE: I finally got 100% accuracy by training Tesseract on a monospaced font (Courier New) at the exact size used in my screenshots. Hope that helps someone in the future :)
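The cell-by-cell comparison described in the question can be sketched roughly as follows. This is only an illustration under stated assumptions, not the trained-Tesseract solution from the update: the 3x5 glyph bitmaps, `GLYPHS`, `read_line` and `render_line` are all hypothetical, and a real version would render the templates once from the chosen font at the exact screenshot size. Note that the horizontal step is the glyph advance width, which is constant for a monospaced font but is generally not equal to the font size.

```python
from typing import Dict, List, Tuple

Bitmap = Tuple[str, ...]  # rows of '#' (ink) and '.' (background)

# Hypothetical 3x5 templates standing in for glyphs rendered from the real font.
GLYPHS: Dict[str, Bitmap] = {
    "0": ("###", "#.#", "#.#", "#.#", "###"),
    "1": (".#.", "##.", ".#.", ".#.", "###"),
    "7": ("###", "..#", ".#.", ".#.", ".#."),
    " ": ("...", "...", "...", "...", "..."),
}
CELL_W, CELL_H = 3, 5  # fixed cell size of the monospaced font


def render_line(text: str) -> List[str]:
    """Build a one-line 'screenshot' by placing glyph cells side by side."""
    return ["".join(GLYPHS[ch][row] for ch in text) for row in range(CELL_H)]


def read_line(image: List[str]) -> str:
    """Decode one text line by exact comparison of fixed-width cells."""
    if len(image) != CELL_H or any(len(row) % CELL_W for row in image):
        raise ValueError("image does not align with the character grid")
    lookup = {bitmap: ch for ch, bitmap in GLYPHS.items()}
    decoded = []
    for x in range(0, len(image[0]), CELL_W):
        cell = tuple(row[x:x + CELL_W] for row in image)
        decoded.append(lookup.get(cell, "?"))  # '?' marks an unknown cell
    return "".join(decoded)


if __name__ == "__main__":
    print(read_line(render_line("107 1")))
```

Exact matching like this only works when the rendering is fully deterministic (same font, size, and no anti-aliasing); any sub-pixel difference makes a cell fall through to `?`.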
Since this is the first result on Google for "tesseract recognize screenshot", let me do a bit of necromancy and add a much simpler solution. Tesseract expects images at around 300 dpi or more, while the standard for Windows is 96 dpi, which means you need to rescale the image to about 300%. After that, the results improve dramatically.
100%
Result:
Whal would you recommend for recognizing all characters from a screensnor 7
200%
Result:
What would you recommend for recognizing all chamcters from a screenth ?
300%
Result:
What would you recommend for recognizing all characters from a screenshot ?
Anything above 300% works just as well.
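A minimal sketch of the rescaling step, assuming the Pillow and pytesseract packages are installed and the Tesseract binary is on the PATH. The function names (`scale_factor`, `target_size`, `ocr_screenshot`) are made up for this example, and the choice of Lanczos resampling and grayscale conversion is my assumption, not something the answer above specifies.

```python
from typing import Tuple


def scale_factor(source_dpi: float = 96.0, target_dpi: float = 300.0) -> float:
    """Ratio needed to bring a screenshot up to the DPI Tesseract expects."""
    return target_dpi / source_dpi


def target_size(size: Tuple[int, int], factor: float) -> Tuple[int, int]:
    """Pixel dimensions after scaling, rounded to whole pixels."""
    width, height = size
    return (round(width * factor), round(height * factor))


def ocr_screenshot(path: str, factor: float = 3.0) -> str:
    """Upscale a screenshot and run Tesseract on the enlarged image."""
    # Imported lazily so the helpers above work without these packages.
    from PIL import Image
    import pytesseract

    img = Image.open(path).convert("L")  # grayscale is enough for black on white
    big = img.resize(target_size(img.size, factor), resample=Image.LANCZOS)
    return pytesseract.image_to_string(big)


if __name__ == "__main__":
    print(scale_factor())                  # 300 / 96
    print(target_size((800, 600), 3.0))
```

The exact ratio 300 / 96 is a little over 3, which is consistent with the observation that 300% or anything above it works.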
I would be surprised if OCR gave such bad results on input of such good quality. You probably want to choose a font with sharp edges and no anti-aliasing; a bigger font size would also help.
Also, if acceptable, try the OCR font given in this SO question:
This should give you the best possible results - if this doesn't go 100%, then I don't know what will...
I don't know what you tried besides Tesseract, but if you haven't tried any others, it might be worth doing so. These seem to have been updated recently (Tesseract was last updated a year ago):
There are some online versions, too, such as:
that you can use to test a sample document. From this link:
it seems that you might need to go commercial to get what you want.
Hope this helps.
I know you already solved your problem, but in case this helps someone else: when dealing with screenshots, I found that OCR engines are sensitive to two things: (1) a resolution that is set incorrectly in the image file header, and (2) transparency (what looks like a white background is actually marked as transparent). For some reason, these problems occur often in screenshot images.
Also, aside from Tesseract, another possibility is to try the API at http://www.wisetrend.com/wisetrend_ocr_cloud.shtml, which is based on the ABBYY OCR engine. (The advantage is that there is nothing to install or configure to check whether it works on your images: you just make an HTTP POST.) Disclaimer: WiseTrend is my company's customer.
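The two screenshot pitfalls mentioned above (transparency and the resolution header) could be handled before OCR along these lines. This is a sketch assuming Pillow is installed; `flatten_pixel` and `prepare_screenshot` are names invented for the example, and 300 dpi is an assumed target value. Writing the DPI only changes metadata in the saved file; it does not resample the pixels.

```python
from typing import Tuple

RGB = Tuple[int, int, int]


def flatten_pixel(rgba: Tuple[int, int, int, int],
                  background: RGB = (255, 255, 255)) -> RGB:
    """Blend one RGBA pixel onto an opaque background colour."""
    r, g, b, a = rgba
    alpha = a / 255.0
    return tuple(
        round(channel * alpha + bg * (1.0 - alpha))
        for channel, bg in zip((r, g, b), background)
    )


def prepare_screenshot(src: str, dst: str, dpi: int = 300) -> None:
    """Replace transparency with white and store an explicit DPI value."""
    # Imported lazily so flatten_pixel works without Pillow installed.
    from PIL import Image

    img = Image.open(src).convert("RGBA")
    canvas = Image.new("RGB", img.size, (255, 255, 255))
    canvas.paste(img, mask=img.split()[3])  # alpha channel used as paste mask
    canvas.save(dst, dpi=(dpi, dpi))        # metadata only, pixels unchanged


if __name__ == "__main__":
    print(flatten_pixel((0, 0, 0, 0)))    # fully transparent pixel
    print(flatten_pixel((0, 0, 0, 255)))  # fully opaque black pixel
```

`flatten_pixel` shows per pixel what the `paste` call does for the whole image: a fully transparent pixel becomes the background colour instead of whatever colour happens to be stored underneath it.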
Do you have the option to change text anti-aliasing at the OS level? Playing around with those settings (or even turning anti-aliasing off) might give you better results with existing OCR engines too.
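If the OS setting cannot be changed, a related workaround (my suggestion, not part of the answer above) is to threshold the grayscale image so the intermediate gray pixels produced by anti-aliasing become pure black or white. A minimal sketch on a plain list-of-rows grayscale image; `binarize` and the cut-off of 128 are assumptions for illustration.

```python
from typing import List


def binarize(gray: List[List[int]], threshold: int = 128) -> List[List[int]]:
    """Map each 0-255 gray value to 0 (black) or 255 (white)."""
    return [[0 if value < threshold else 255 for value in row] for row in gray]


if __name__ == "__main__":
    # One row with an anti-aliased edge: black, two grays, white.
    print(binarize([[0, 100, 200, 255]]))
```

The right threshold depends on the font and rendering, so it is worth checking a few values against the actual screenshots.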
You can use ABBYY FineReader 12.0 to extract text from PDFs and/or screenshot images and save it directly in your desired file format.
See: ABBYY FineReader 15 - Free Trial