How to copy/read PDF files in Ruby/Rails 3
I need to both read and write PDF files, and in some cases merge a PDF I already have with one I need to read in. I tried to do this with the pdf-reader gem and the prawn gem.
The pdf-reader gem doesn't seem to allow straight copying of a file. It only pulls text out of files, without formatting or images, unless you write those in separately. Even then, it only pulls out pieces of files and skips others. Is there anything else out there for Ruby?
Edit: To be more specific, in some cases I need an exact copy of the PDF, and in others I need to overlay a copy of one PDF on top of a copy of another. Neither pdf-reader nor docsplit seems able to make a copy (that is, read in the text, formatting, images, fonts, etc.).
Comments (4)
https://github.com/paulschreiber/pdf-merger
Try this: Prawn: Fast, Nimble PDF Generation For Ruby
https://github.com/sandal/prawn
After playing around a lot, I found this question: "overlay one pdf or ps file on top of another".
It seems that pdftk is as good as it's going to get.
There is a gem for this, but it looks like it doesn't support overlay: http://pdf-toolkit.rubyforge.org/
EDIT: It looks like there is a much better gem out there, still unpublished, that supports overlay: https://github.com/tcocca/active_pdftk
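For anyone who would rather call pdftk directly than wait for a gem, here is a minimal sketch of how the merge and overlay cases might be driven from Ruby. It assumes the pdftk command-line tool is installed and on the PATH. The module and method names (PdftkSketch, merge_command, overlay_command, run) are hypothetical, made up for illustration, and not part of any gem. Only the cat, stamp, background, multistamp and multibackground operations come from pdftk itself.

```ruby
require "open3"

# Hypothetical helper module (not from any gem): builds pdftk argument
# lists and runs them without going through a shell.
module PdftkSketch
  module_function

  OVERLAY_MODES = %w[stamp background multistamp multibackground].freeze

  # Concatenate PDFs in order: pdftk a.pdf b.pdf cat output out.pdf
  def merge_command(inputs, output)
    raise ArgumentError, "need at least one input" if inputs.empty?
    ["pdftk", *inputs, "cat", "output", output]
  end

  # Overlay one PDF on another: pdftk base.pdf stamp overlay.pdf output out.pdf
  # "stamp" draws the first page of the overlay on top of every base page;
  # "background" draws it underneath. The "multi" variants pair pages up.
  def overlay_command(base, overlay, output, mode: "stamp")
    unless OVERLAY_MODES.include?(mode)
      raise ArgumentError, "unsupported mode: #{mode}"
    end
    ["pdftk", base, mode, overlay, "output", output]
  end

  # Run a command built above; raises if pdftk exits with a non-zero status.
  def run(command)
    _stdout, stderr, status = Open3.capture3(*command)
    raise "pdftk failed: #{stderr}" unless status.success?
    true
  end
end
```

Usage would look something like PdftkSketch.run(PdftkSketch.overlay_command("form.pdf", "letterhead.pdf", "out.pdf")), where the file names are placeholders. For the exact-copy case no PDF library is needed at all: a byte-for-byte copy such as FileUtils.cp("in.pdf", "copy.pdf") preserves text, formatting, images and fonts.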
You can use Lucene or Solr (which has hooks for Ruby on Rails) to index and read in .pdf files as well as Microsoft Office documents (e.g. Word, PowerPoint, Excel).