Convert PDF to HTML
Possible Duplicate:
Convert PDF to HTML
I need to convert resumes that are uploaded in PDF format to HTML. I am already converting doc and docx files using livedocx.com, but they don't support converting from PDF. I have read the other posts on Stack Overflow about this, and the standard solution is to install the pdf2html command line tool. That is not an option for me, however, because this is a shared hosting server that I do not administer. The host will not install the tool for me, so I need either a third-party service or a clean way to do this in native PHP. The PHP version is 5.2, running on the latest CentOS. Please help!
Chris
Comments (2)
CentOS should have pdftohtml installed by default, and that is the tool to use. If your hosting provider has removed it for some reason, it needs to be reinstalled, ideally through the OS's package manager. If you have SSH access, log in and install it that way.
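As a rough sketch of that step: the snippet below checks whether pdftohtml is already on the PATH and, if not, prints a likely install command. It assumes the binary comes from the poppler-utils package, which is the usual case on CentOS but worth confirming for your release. The helper names `check_pdftohtml` and `convert_resume` are made up for illustration.

```shell
#!/bin/sh
# Report whether pdftohtml is available; suggest an install command if not.
# Assumption: on CentOS the binary ships in the poppler-utils package.
check_pdftohtml() {
    if command -v pdftohtml >/dev/null 2>&1; then
        echo "pdftohtml found at: $(command -v pdftohtml)"
    else
        echo "pdftohtml not found; as root, try: yum install poppler-utils"
        return 1
    fi
}

# Hypothetical helper: convert one PDF into a single HTML file.
# -noframes asks pdftohtml for one HTML document instead of a frameset.
convert_resume() {
    pdf="$1"
    html="$2"
    [ -f "$pdf" ] || { echo "no such file: $pdf" >&2; return 1; }
    check_pdftohtml >/dev/null || return 127
    pdftohtml -noframes "$pdf" "$html"
}

# Example: convert_resume resume.pdf resume.html
```

Output file naming can differ slightly between pdftohtml versions, so it is worth checking what actually lands on disk after the first run.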
If you don't have SSH access and the provider isn't willing to install it for you, I guess the only option is to find a web service similar to the one you're using for doc/docx. I don't know of a good one, but that's what Google is there for.
Another, less elegant solution would be to use Ghostscript (which is more likely to be pre-installed) to convert the PDF to PNG images, then display those. This has the advantage of working on more PDF files, and the layout will be kept perfectly, but the result will be all images.
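A minimal sketch of the Ghostscript route, assuming the `gs` binary is on the PATH. The helper name `pdf_to_png`, the 150 dpi resolution, and the `page-%03d.png` naming pattern are illustrative choices, not anything from the answer above.

```shell
#!/bin/sh
# Hypothetical helper: render each page of a PDF to a numbered PNG file.
pdf_to_png() {
    pdf="$1"
    outdir="$2"
    [ -f "$pdf" ] || { echo "no such file: $pdf" >&2; return 1; }
    command -v gs >/dev/null 2>&1 || { echo "gs not found" >&2; return 127; }
    mkdir -p "$outdir" || return 1
    # -dSAFER restricts file access, -dBATCH/-dNOPAUSE run without prompts,
    # png16m is the 24-bit colour PNG device, -r150 renders at 150 dpi.
    gs -dSAFER -dBATCH -dNOPAUSE -q \
       -sDEVICE=png16m -r150 \
       -sOutputFile="$outdir/page-%03d.png" "$pdf"
}

# Example: pdf_to_png resume.pdf pages
```

The resulting images can then be shown with ordinary `<img>` tags, one per page. The text will not be selectable or searchable, which is the trade-off mentioned above.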