OCR for graph paper
I would like to take a PDF of a scanned graph-paper notebook (with handwriting) and turn it into a text file.
How can I do this?
Thanks
Comments (3)
Check out an OCR library, like OCRopus. I don't think it accepts PDF input, so you may have to convert the pages to TIFF or JPEG first.
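For the conversion step, here is a minimal sketch in Python. It assumes the `pdftoppm` command-line tool from poppler-utils is installed, which the comment above does not mention; the function names, the output directory layout, and the 300 DPI default are illustrative choices, not anything the OCR libraries require.

```python
import shutil
import subprocess
from pathlib import Path


def build_pdftoppm_command(pdf_path, out_prefix, dpi=300):
    """Return the argument list that renders each PDF page to a TIFF file."""
    # -r sets the rendering resolution in DPI; -tiff selects TIFF output.
    return ["pdftoppm", "-r", str(dpi), "-tiff", str(pdf_path), str(out_prefix)]


def pdf_to_tiffs(pdf_path, out_dir, dpi=300):
    """Render every page of pdf_path into out_dir and return the image paths."""
    if shutil.which("pdftoppm") is None:
        raise RuntimeError("pdftoppm not found; install poppler-utils first")
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    prefix = out_dir / "page"
    subprocess.run(build_pdftoppm_command(pdf_path, prefix, dpi), check=True)
    # pdftoppm writes one numbered file per page under the given prefix.
    return sorted(out_dir.glob("page-*.tif*"))
```

Calling `pdf_to_tiffs("notebook.pdf", "pages")` (both names hypothetical) should leave one TIFF per scanned page in `pages/`, ready to hand to whichever OCR tool you try.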
There are OCR libraries that convert typed text (OCRopus, Tesseract, etc.).
There are also Java-based handwriting libraries. I am not sure whether OCRopus has that ability; one library I was looking into for handwriting recognition was:
Online Video
Java Neural Networks
Conceivably you could take the PDF, convert it into a TIFF if need be (depending on the software), and it would give you some output.
Good luck!
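As a rough sketch of the "feed the images to an OCR tool" step, the following Python runs the Tesseract command-line program over a list of page images and joins the results. It assumes `tesseract` is installed and on the PATH; the function names and the `eng` language default are illustrative. Tesseract is built mainly for printed text, so expect poor results on handwriting.

```python
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, lang="eng"):
    """Return the argument list that OCRs one image and prints the text."""
    # Using "stdout" as the output base makes tesseract print the recognized
    # text instead of writing a .txt file.
    return ["tesseract", str(image_path), "stdout", "-l", lang]


def ocr_pages(image_paths, lang="eng"):
    """OCR each image in order and return the combined text."""
    texts = []
    for path in image_paths:
        result = subprocess.run(
            build_tesseract_command(path, lang),
            capture_output=True,
            text=True,
            check=True,
        )
        texts.append(result.stdout)
    return "\n".join(texts)


def save_text(text, out_path):
    """Write the recognized text to a UTF-8 text file."""
    Path(out_path).write_text(text, encoding="utf-8")
```

For example, `save_text(ocr_pages(["page-1.tif", "page-2.tif"]), "notebook.txt")` (file names hypothetical) would produce a single text file for the notebook.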
If the notebook is a PDF file, you could e-mail it to a Gmail account; Gmail then lets you "view" the PDF in your browser as an HTML file. The pages still remain images, though.
If you want the text out of it, OCR might work, but it may also be unable to extract the text.