Statically compiling pdftk for Heroku. Need to split a PDF into single-page files

Posted on 2024-11-30 15:51:49 · 803 words · 0 views · 0 comments

We're using Heroku to host our Rails application, and we've moved to the Cedar stack. This stack does not have the pdftk library installed. I contacted support and was told to statically compile it for amd64 Ubuntu and include it in my application.

This has proved more difficult than I thought. Initially I downloaded the Ubuntu package (http://packages.ubuntu.com/natty/pdftk), extracted it, and included the binary as well as the shared libraries. I'm getting strange errors like:

Unhandled Java Exception:
java.lang.NullPointerException
   at com.lowagie.text.pdf.PdfCopy.copyIndirect(pdftk)
   at com.lowagie.text.pdf.PdfCopy.copyObject(pdftk)
   at com.lowagie.text.pdf.PdfCopy.copyDictionary(pdftk)

I'm assuming this is because some of the dependencies aren't installed?

So here are my questions:

  1. Is there an easier way to statically compile a library? Or do I need to move over its binary as well as all of its libraries and dependencies?
  2. I'm just trying to split a multi-page PDF into single-page files in Ruby. Is there a way to do this without pdftk? Or am I stuck with trying to statically compile pdftk?

Thanks for the help. I know this isn't an easy problem, but I would really appreciate help with this one. I've wasted close to 6 hours trying to get this damn thing to work.

Comments (5)

栀子花开つ 2024-12-07 15:51:49

Unfortunately, Heroku keeps stripping out magic to add flexibility. As a result it feels more and more like the days when I used to manage and maintain my own servers. There is no easy solution. My "monkey patch" is to send the file to a server where I can install pdftk, process the file there, and send it back. Not great, but it works. Having to deal with this defeats the purpose of using Heroku.

混吃等死 2024-12-07 15:51:49

The easy solution is to add the one dependency for pdftk that is not found on Heroku.

$ ldd pdftk
    linux-vdso.so.1 =>  (0x00007ffff43ca000)
    libgcj.so.10 => not found
    libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x00007f1d26d48000)
    libm.so.6 => /lib/libm.so.6 (0x00007f1d26ac4000)
    libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x00007f1d268ad000)
    libc.so.6 => /lib/libc.so.6 (0x00007f1d2652a000)
    libpthread.so.0 => /lib/libpthread.so.0 (0x00007f1d2630c000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f1d27064000)

I put pdftk and libgcj.so.10 into the bin directory of my app (/app/bin on Heroku). You then just need to tell Heroku to look in that directory when loading libraries.

To see what your current LD_LIBRARY_PATH is set to, run:

$ heroku config
LD_LIBRARY_PATH:             /app/.heroku/vendor/lib
LIBRARY_PATH:                /app/.heroku/vendor/lib

Then add /app/bin (or whatever directory you chose to store libgcj.so.10 in) to it:

$ heroku config:set LD_LIBRARY_PATH=/app/.heroku/vendor/lib:/app/bin

The downside is that my slug size went from 15.9MB to 27.5MB.
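
A minimal sketch of how this setup could be checked and then used from a shell, for example inside heroku run bash. The helper names missing_libs and split_pdf, the PDFTK override variable, and the page_%03d.pdf naming are illustrative assumptions, not part of the setup described above. burst is pdftk's own operation for splitting a document into one file per page.

```shell
# Hypothetical helper: reads `ldd` output on stdin and prints the name of
# every shared library that the loader reported as "not found".
missing_libs() {
  awk '/=> not found/ { print $1 }'
}

# Hypothetical helper: splits a PDF into one file per page using pdftk's
# "burst" operation. The binary can be overridden through the PDFTK
# variable (an assumption made here so a vendored copy can be used).
split_pdf() {
  input=$1
  outdir=$2
  mkdir -p "$outdir" || return 1
  "${PDFTK:-pdftk}" "$input" burst output "$outdir/page_%03d.pdf"
}

# Example usage (paths are assumptions based on the /app/bin layout above):
#   LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/app/bin" ldd /app/bin/pdftk | missing_libs
#   PDFTK=/app/bin/pdftk split_pdf input.pdf /tmp/pages
```

If missing_libs prints nothing, the loader can resolve every library with the extended LD_LIBRARY_PATH. Anything it does print still has to be vendored alongside the binary before split_pdf can work.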

时光与爱终年不遇 2024-12-07 15:51:49

We encountered the same problem. The solution we came up with was to use Stapler instead (https://github.com/hellerbarde/stapler). It's a Python utility that only requires one extra module (pyPdf) to be installed on Heroku.

I was pointed in the right direction by this blog entry: http://theprogrammingbutler.com/blog/archives/2011/07/28/running-pdftotext-on-heroku/

Here are the steps I followed to install pyPdf:

Access the Heroku bash console:

heroku run bash

Install the latest version of pyPdf:

cd tmp
curl http://pybrary.net/pyPdf/pyPdf-1.13.tar.gz -o pyPdf-1.13.tar.gz
tar zxvf pyPdf-1.13.tar.gz
cd pyPdf-1.13
python setup.py install --user

This puts all the necessary files under a .local directory at the root of the app. I downloaded that directory and added it to our git repo, along with the stapler utility. Finally, I updated my code to use stapler instead of pdftk, et voilà! Splitting PDFs from Heroku again.

Another way, probably cleaner, would be to encapsulate it in a gem ( http://news.ycombinator.com/item?id=2816783 )

挽梦忆笙歌 2024-12-07 15:51:49

I read a similar question on SO, and found this approach by Ryan Daigle that worked for me as well: instead of building local binaries that are hard to match to Heroku's servers, use the remote environment to compile and build the required dependencies. This is accomplished using the Vulcan gem, which is provided by Heroku.

Ryan's article "Building Dependency Binaries for Heroku Applications"

Another approach, by Jon Magic (untested by me), is to download and compile the dependency directly through Heroku's bash, i.e. on the server itself: "Compiling Executables on Heroku".

On a side note, both approaches produce binaries that will break if Heroku's underlying environment changes enough.

过气美图社 2024-12-07 15:51:49

Try prawn.
