OCRing application text (not scanned, not a CAPTCHA)
I'd like to interface with an application by reading the text it displays.
I've had success with some applications when Windows isn't doing any font smoothing: I type in a phrase manually, render it in every Windows font, and look for a match. Once the font is identified, I generate an image of every letter in that font and map each letter image on screen to its letter.
This won't work if any font smoothing is being done, though, either by Windows or by the application. What is the state of the art in OCRing computer-generated text? It seems like it should be easier than breaking CAPTCHAs or OCRing scanned text. Where can I find resources about this? So far I've only found articles on CAPTCHA breaking or OCRing scanned text.
I prefer solutions that are easily accessible from Python, though if there's a good one in another language I'll do the work to interface with it.
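To make the unsmoothed approach concrete, here is a minimal sketch of the exact-match lookup. The 3x5 glyph bitmaps, the fixed cell width, and the function names are all made up for illustration; in practice the glyphs would come from rendering each character of the matched font, and a proportional font would need real segmentation rather than fixed-width slicing.

```python
# Hypothetical 3x5 monochrome glyphs standing in for one rendered font.
# '#' is an inked pixel, '.' is background.
GLYPHS = {
    "H": ("#.#", "#.#", "###", "#.#", "#.#"),
    "I": ("###", ".#.", ".#.", ".#.", "###"),
    "L": ("#..", "#..", "#..", "#..", "###"),
}


def build_lookup(glyphs):
    """Map each glyph bitmap (a tuple of row strings) to its character."""
    return {bitmap: char for char, bitmap in glyphs.items()}


def render(text, glyphs, gap=1):
    """Render text as rows of pixels with `gap` blank columns between glyphs."""
    height = len(next(iter(glyphs.values())))
    sep = "." * gap
    return [sep.join(glyphs[c][r] for c in text) for r in range(height)]


def read_line(rows, lookup, width=3, gap=1):
    """Slice a fixed-width line into cells and look each cell up exactly.

    Any cell that is not a pixel-perfect match comes back as '?'.
    """
    chars = []
    for x in range(0, len(rows[0]), width + gap):
        cell = tuple(row[x:x + width] for row in rows)
        chars.append(lookup.get(cell, "?"))
    return "".join(chars)


lookup = build_lookup(GLYPHS)
screen = render("HILL", GLYPHS)   # stands in for pixels captured from the app
decoded = read_line(screen, lookup)
```

Because the lookup is an exact dictionary match, a single changed pixel in a cell makes it miss, which is why this stops working as soon as smoothing is turned on.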
Comments (1)
I'm not exactly sure what you mean, but I think simply reading the text with an OCR program would work well.
Tesseract is amazingly accurate for scanned documents, so text in a specific font should be a breeze for it to read. Here's my Python OCR solution: "Python OCR Module in Linux?"
Alternatively, you could generate each character as an image and search for its locations in the captured image. It might work, but I have no idea how accurate it would be with smoothing.
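A rough sketch of that last idea, in case it helps: slide a character template over a grayscale image and accept any window whose mean pixel difference is under a threshold, so that smoothed edges don't have to match exactly. The function names, the tiny sample image, and the `max_diff=40` threshold are my own assumptions for illustration, not tuned values, and I haven't measured how well this holds up on real smoothed text.

```python
def match_score(image, template, top, left):
    """Mean absolute pixel difference between the template and one window."""
    total = 0
    for r, row in enumerate(template):
        for c, value in enumerate(row):
            total += abs(image[top + r][left + c] - value)
    return total / (len(template) * len(template[0]))


def find_glyph(image, template, max_diff=40):
    """Return (row, col) of every window whose mean difference is <= max_diff."""
    t_height, t_width = len(template), len(template[0])
    hits = []
    for top in range(len(image) - t_height + 1):
        for left in range(len(image[0]) - t_width + 1):
            if match_score(image, template, top, left) <= max_diff:
                hits.append((top, left))
    return hits


# Hypothetical data: a crisp vertical bar as the template (0 = background,
# 255 = ink) and a smoothed version of the same bar inside a wider image.
template = [[0, 255, 0]] * 3
image = [[0, 0, 30, 220, 30, 0, 0]] * 3

tolerant_hits = find_glyph(image, template)             # allows smoothing
exact_hits = find_glyph(image, template, max_diff=0)    # pixel-perfect only
```

With `max_diff=0` this degenerates into the exact matching that breaks under smoothing; raising the threshold trades missed characters for false positives between similar glyphs, so it would need checking against the actual font.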