Statically compiling pdftk for Heroku: need to split a PDF into single-page files

Posted 2024-11-30 15:51:49

We're using Heroku to host our Rails application, and we've moved to the Cedar stack. This stack does not have pdftk installed. I contacted support and was told to statically compile it for amd64 Ubuntu and include it in my application.

This has proved more difficult than I thought. Initially I downloaded the Ubuntu package (http://packages.ubuntu.com/natty/pdftk), extracted it, and included the binary as well as the shared libraries. I'm getting strange errors like:

Unhandled Java Exception:
java.lang.NullPointerException
   at com.lowagie.text.pdf.PdfCopy.copyIndirect(pdftk)
   at com.lowagie.text.pdf.PdfCopy.copyObject(pdftk)
   at com.lowagie.text.pdf.PdfCopy.copyDictionary(pdftk)

I'm assuming this is because some of the dependencies aren't installed?

So here are my questions:

Is there an easier way to statically compile a library? Or do I need to move over its binary file as well as all of its libraries and dependencies?
I'm just trying to split a multi-page PDF into single page files in ruby. Is there a way to do this without PDFTK? Or am I stuck with trying to statically compile PDFTK?

Thanks for the help, I know this isn't an easy problem, but would really appreciate help with this one. I've wasted close to 6 hours trying to get this damn thing to work.

栀子花开つ 2024-12-07 15:51:49

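Unfortunately, Heroku keeps stripping away the magic in favor of flexibility. As a result, it feels more and more like the days when I managed and maintained my own servers. There is no easy solution. My "monkey patch" is to send the file to a server where I can install PDFTK, process it there, and send it back. Not great, but it works. Having to deal with this defeats the purpose of using Heroku.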

混吃等死 2024-12-07 15:51:49

The easy solution is to add the one pdftk dependency that is not found on Heroku.

$ ldd pdftk
    linux-vdso.so.1 =>  (0x00007ffff43ca000)
    libgcj.so.10 => not found
    libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x00007f1d26d48000)
    libm.so.6 => /lib/libm.so.6 (0x00007f1d26ac4000)
    libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x00007f1d268ad000)
    libc.so.6 => /lib/libc.so.6 (0x00007f1d2652a000)
    libpthread.so.0 => /lib/libpthread.so.0 (0x00007f1d2630c000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f1d27064000)
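
When the ldd output is longer than this, filtering it for unresolved entries can help. The following is only a minimal sketch: the function name missing_libs is made up for illustration, and it assumes the "name => not found" line format shown above.

```shell
# Print the name of every shared library that ldd reports as "not found".
# Reads ldd-style output on stdin, so it works on saved output as well.
# (missing_libs is a hypothetical helper name, not part of any tool.)
missing_libs() {
  awk '/=> not found/ { print $1 }'
}

# Usage, assuming the pdftk binary is in the current directory:
#   ldd ./pdftk | missing_libs
```

Run against the output above, this should list only libgcj.so.10.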

I put pdftk and libgcj.so.10 into the bin directory of my app (/app/bin on the dyno). You then just need to tell Heroku to look in that directory when loading libraries.

You can run

$ heroku config
LD_LIBRARY_PATH:             /app/.heroku/vendor/lib
LIBRARY_PATH:                /app/.heroku/vendor/lib

to see what your current LD_LIBRARY_PATH is set to, and then add /app/bin (or whichever directory you chose for libgcj.so.10) to it:

$ heroku config:set LD_LIBRARY_PATH=/app/.heroku/vendor/lib:/app/bin
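
If the existing value differs from app to app, the new value can be built rather than typed by hand. This is a rough sketch only; append_path is a hypothetical helper name, and the paths are the ones from this answer.

```shell
# append_path CURRENT DIR
# Print CURRENT with DIR appended, unless DIR is already one of its entries.
# (append_path is a hypothetical helper, not a Heroku command.)
append_path() {
  current=$1
  dir=$2
  case ":$current:" in
    *":$dir:"*) printf '%s\n' "$current" ;;       # DIR already listed
    "::")       printf '%s\n' "$dir" ;;           # CURRENT was empty
    *)          printf '%s\n' "$current:$dir" ;;  # append DIR
  esac
}

# Usage:
#   new_value=$(append_path "/app/.heroku/vendor/lib" "/app/bin")
#   heroku config:set LD_LIBRARY_PATH="$new_value"
```

Checking for an existing entry first keeps the variable from growing a duplicate each time the command is repeated.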

The downside is that my slug size went from 15.9MB to 27.5MB.
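
For the split itself, the question does not show the command used, but pdftk's burst operation is the usual way to write one file per page. A minimal sketch, assuming that is what is wanted; split_pdf and PDFTK_BIN are names invented here, and the page_%03d.pdf naming is just an example.

```shell
# split_pdf INPUT OUTDIR
# Burst INPUT into one PDF per page under OUTDIR using pdftk's burst operation.
# PDFTK_BIN is a hypothetical override for a vendored binary such as
# /app/bin/pdftk; it defaults to whatever pdftk is on PATH.
split_pdf() {
  input=$1
  outdir=$2
  mkdir -p "$outdir" || return 1
  "${PDFTK_BIN:-pdftk}" "$input" burst output "$outdir/page_%03d.pdf"
}

# Usage:
#   PDFTK_BIN=/app/bin/pdftk split_pdf report.pdf tmp/pages
```

From Ruby, the same command line could be run through a shell call once the binary and its library load correctly.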

时光与爱终年不遇 2024-12-07 15:51:49

We encountered the same problem. The solution we came up with was to use Stapler (https://github.com/hellerbarde/stapler) instead. It's a Python utility and only requires one extra module (pyPdf) to be installed on Heroku.

I was pointed to this blog entry: http://theprogrammingbutler.com/blog/archives/2011/07/28/running-pdftotext-on-heroku/

Here are the steps I followed to install pyPdf:

Access the Heroku bash console:

heroku run bash

Download pyPdf, unpack it, and install it from inside the extracted directory:

cd tmp
curl http://pybrary.net/pyPdf/pyPdf-1.13.tar.gz -o pyPdf-1.13.tar.gz
tar zxvf pyPdf-1.13.tar.gz
cd pyPdf-1.13
python setup.py install --user

This puts all the necessary files under a .local directory at the root of the app. I downloaded that directory and added it to our git repo, along with the stapler utility. Finally I updated my code to use stapler instead of pdftk, et voilà! Splitting PDFs from Heroku again.

Another way, probably cleaner, would be to encapsulate it in a gem ( http://news.ycombinator.com/item?id=2816783 )
