I have some scanned PDF files, and I need VB.NET source code that converts a scanned PDF to text.
A scanned file most likely contains no text, only images, so you will need an OCR tool to extract the text.
There are several OCR libraries available, for example:
Open source OCR
https://stackoverflow.com/questions/1085/free-ocr-library
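To make the "render the pages, then OCR them" idea concrete, here is a minimal VB.NET sketch. It is not taken from the answers above; it assumes that the `pdftoppm` (Poppler) and `tesseract` (Tesseract OCR) command-line tools are installed and on the PATH, and it simply shells out to them. The names `ScannedPdfToText`, `RunTool`, `Quote`, and `ExtractText` are made up for illustration.

```vbnet
Imports System
Imports System.Diagnostics
Imports System.IO
Imports System.Linq
Imports System.Text

Module ScannedPdfToText

    ' Runs an external program, waits for it to exit, and throws on a non-zero exit code.
    Private Sub RunTool(fileName As String, arguments As String)
        Dim info As New ProcessStartInfo(fileName, arguments) With {
            .UseShellExecute = False,
            .CreateNoWindow = True
        }
        Using proc As Process = Process.Start(info)
            proc.WaitForExit()
            If proc.ExitCode <> 0 Then
                Throw New InvalidOperationException($"{fileName} exited with code {proc.ExitCode}")
            End If
        End Using
    End Sub

    ' Wraps a value in double quotes so that paths containing spaces survive argument parsing.
    Private Function Quote(value As String) As String
        Return """" & value & """"
    End Function

    ' Assumption: pdftoppm and tesseract are available on the PATH.
    Function ExtractText(pdfPath As String, Optional language As String = "eng") As String
        Dim workDir As String = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName())
        Directory.CreateDirectory(workDir)
        Try
            ' Step 1: render every page of the PDF to a 300 dpi PNG (page-1.png, page-2.png, ...).
            RunTool("pdftoppm", $"-r 300 -png {Quote(pdfPath)} {Quote(Path.Combine(workDir, "page"))}")

            ' Sort by name length, then by name, so page 10 comes after page 9
            ' whether or not the page numbers are zero-padded.
            Dim pages = Directory.GetFiles(workDir, "page-*.png").
                OrderBy(Function(p) p.Length).
                ThenBy(Function(p) p, StringComparer.Ordinal)

            ' Step 2: OCR each page image; tesseract writes its result to <outBase>.txt.
            Dim result As New StringBuilder()
            For Each pagePath As String In pages
                Dim outBase As String = Path.ChangeExtension(pagePath, Nothing)
                RunTool("tesseract", $"{Quote(pagePath)} {Quote(outBase)} -l {language}")
                result.AppendLine(File.ReadAllText(outBase & ".txt"))
            Next
            Return result.ToString()
        Finally
            ' Remove the temporary images and text files.
            Directory.Delete(workDir, True)
        End Try
    End Function

    Sub Main(args As String())
        If args.Length < 1 Then
            Console.Error.WriteLine("Usage: ScannedPdfToText <input.pdf>")
            Return
        End If
        Console.WriteLine(ExtractText(args(0)))
    End Sub

End Module
```

Shelling out keeps the sketch free of any particular .NET OCR binding. If you would rather use a library in-process, the same two steps (rasterize the pages, then OCR each image) still apply. OCR accuracy depends heavily on scan quality and resolution, so treat the output as a draft to be checked.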
Take a look at http://snipt.org/lOgh/. It is written in C# (which should be relatively easy to port to VB.NET) and uses a hosted OCR service accessible through an API.