How do I make an OCR program?

Posted on 2024-11-18 05:33:10

Comments (3)

謸气贵蔟 2024-11-25 05:33:10

A basic approach is to build histograms of black pixels. First, project all pixels onto a vertical line, i.e. count the black pixels in each pixel row. Deep valleys in this histogram mark the separation between lines of text (try different angles if the paper might be tilted). Then, for each line (or for the whole page if you know the font is monospaced), project the pixels onto a horizontal histogram, i.e. count the black pixels in each pixel column. This gives a strong indication of the inter-character spaces. At a minimum, it gives you values for the average character height and width, which will help in the next steps.
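As a rough illustration of the projection idea, here is a minimal sketch in plain Python. It assumes the page is already binarized into a list of rows where 1 means a black pixel, and it assumes the page is not tilted. The names `row_profile`, `column_profile`, `find_runs`, `segment` and the `PAGE` sample are made up for this example; real code would more likely use an array library and a noise-tolerant threshold.

```python
def row_profile(image):
    """Count black pixels (value 1) in each pixel row."""
    return [sum(row) for row in image]


def column_profile(image):
    """Count black pixels (value 1) in each pixel column."""
    return [sum(col) for col in zip(*image)]


def find_runs(profile, threshold=0):
    """Return (start, end) pairs of consecutive entries above threshold.

    The end index is exclusive. Everything between two runs is a valley,
    i.e. a gap between text lines or between characters.
    """
    runs = []
    start = None
    for i, value in enumerate(profile):
        if value > threshold and start is None:
            start = i
        elif value <= threshold and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def segment(image):
    """Split a binary page into text lines, then each line into boxes.

    Returns one list per text line; each box is
    (left, top, right, bottom) with right and bottom exclusive.
    """
    lines = []
    for top, bottom in find_runs(row_profile(image)):
        line = image[top:bottom]
        lines.append([(left, top, right, bottom)
                      for left, right in find_runs(column_profile(line))])
    return lines


# Tiny hypothetical page: two text lines with two "characters" each.
PAGE = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0, 0],
    [0, 1, 1, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]

for boxes in segment(PAGE):
    print(boxes)
```

Averaging the box widths and heights returned by `segment` gives the character size estimate mentioned above. On noisy scans a `threshold` above 0 helps, since stray specks would otherwise fill the valleys.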

After that, you need to take care of kerning (where characters overlap). Find the connected pixels, possibly after first applying dilation or erosion to the image to compensate for scanning artifacts.
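The connected-pixel step could look something like the sketch below. It is plain Python under the same assumption of a binary image stored as a list of rows (1 = black). The 3x3 dilation, the 8-connectivity, and the names `dilate`, `connected_components` and `SCAN` are choices made for this example, not something the answer prescribes.

```python
from collections import deque


def dilate(image):
    """Binary dilation with a 3x3 square.

    A pixel becomes 1 if it or any of its 8 neighbours is 1, which
    closes one-pixel breaks in a stroke.
    """
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for ny in range(max(0, y - 1), min(h, y + 2)):
                for nx in range(max(0, x - 1), min(w, x + 2)):
                    if image[ny][nx]:
                        out[y][x] = 1
    return out


def connected_components(image):
    """Find 8-connected groups of 1-pixels with a breadth-first flood fill.

    Returns one bounding box (left, top, right, bottom) per group, in
    scan order, with right and bottom exclusive.
    """
    h, w = len(image), len(image[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not image[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            left, top, right, bottom = x, y, x + 1, y + 1
            while queue:
                cy, cx = queue.popleft()
                left, right = min(left, cx), max(right, cx + 1)
                top, bottom = min(top, cy), max(bottom, cy + 1)
                for ny in range(max(0, cy - 1), min(h, cy + 2)):
                    for nx in range(max(0, cx - 1), min(w, cx + 2)):
                        if image[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((left, top, right, bottom))
    return boxes


# Hypothetical scan: the left stroke is broken by a one-pixel gap,
# the right stroke is intact.
SCAN = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
]

print(len(connected_components(SCAN)))          # fragments in the raw scan
print(len(connected_components(dilate(SCAN))))  # fragments after dilation
```

Unlike the column histogram, this separates kerned characters whose columns overlap as long as their pixels do not touch. Dilation trades one problem for another: it rejoins broken strokes but can also fuse neighbouring characters, and it grows every bounding box by a pixel on each side.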

Depending on the quality of the scanned image, you may have to use more advanced techniques, but this will get you going.

刘备忘录 2024-11-25 05:33:10

This doesn't sound like artificial intelligence; it sounds like you're talking about OCR:

http://en.wikipedia.org/wiki/Optical_character_recognition

See Google's Tesseract:

http://code.google.com/p/tesseract-ocr/

EDIT: The original, unedited question asked about artificial intelligence.

岁月流歌 2024-11-25 05:33:10

To me, the question itself does not seem clear.

Since it is about OCR, I will leave a couple of articles here that may help (they helped me, at least):

Also, as mentioned above, Tesseract is a good open-source OCR engine (it is the one I personally use as well), and it can be called from Python through wrappers such as pytesseract. Another approach you could take is to train your own classifier with sklearn.

You may also want to check this Stack Overflow post.

I am also fairly sure that you can search ResearchGate for relevant papers (I found some, but I am not sure they are what you need).

I think the generic answer above suits the generic question.
