Detecting PDF Portfolios or PDF Packages in code

Posted 2024-11-05 20:38:58

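Does anyone know how to detect whether a given PDF file is a PDF Portfolio or a PDF Package, rather than a "regular" PDF? I'd prefer a Java solution, but since I haven't yet found any information on detecting the specific type of a PDF, I'll take what I can get and then try to work out the Java solution from there.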

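(From searching past questions, it seems that many people don't know that PDF Portfolios and PDF Packages exist. Broadly, both are ways in which Adobe allows multiple discrete PDFs to be packaged into a single PDF file. Opening a PDF Package in Reader shows the user a list of the embedded PDFs and allows further viewing from there. PDF Portfolios appear to be a bit more complicated: they also include a Flash-based browser for the embedded files, and then allow the user to extract the discrete PDFs from there. My issue with them, and the reason I'd like to be able to detect them in code, is that OS X's built-in Preview.app cannot read these files, so I'd like to at least warn the users of my web app that uploading them can lead to reduced cross-platform compatibility.)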

墨小墨 2024-11-12 20:38:59

I was facing the same problem while extracting data through Kofax, and I found a solution that works fine. You need to add an extra jar (Aspose.PDF for Java, judging by the package names) for the Document class.

import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class PDFPortfolio {

    /**
     * Extracts every file embedded in a PDF Portfolio to the e:/ drive.
     *
     * @param args unused
     */
    public static void main(String[] args) {

        // requires the Aspose.PDF jar on the classpath
        com.aspose.pdf.Document pdfDocument = new com.aspose.pdf.Document("e:/pqr1.pdf");
        // get collection of embedded files
        com.aspose.pdf.EmbeddedFileCollection embeddedFiles = pdfDocument.getEmbeddedFiles();
        // iterate through each file of the Portfolio (indices run from 1 to size())
        for (int counter = 1; counter <= embeddedFiles.size(); counter++) {
            com.aspose.pdf.FileSpecification fileSpecification = embeddedFiles.get_Item(counter);
            // try-with-resources closes both streams even if the copy fails;
            // the output file is overwritten rather than appended to
            try (InputStream input = fileSpecification.getContents();
                 OutputStream output = new FileOutputStream("e:/" + fileSpecification.getName())) {
                // copy the embedded file out of the PDF
                byte[] buffer = new byte[4096];
                int n;
                while ((n = input.read(buffer)) != -1) {
                    output.write(buffer, 0, n);
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
        }

    }

}

怪异←思 2024-11-12 20:38:58

This question is old, but in case someone wants to know, it is possible. It can be done with Acrobat and JavaScript by using the following check.

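 // collection is a property of the Doc object, not a method;
 // `this` is the Doc when the script runs in the document's context
 if (this.collection != null)
 {
     // It is a Portfolio (or Package)
 }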

The Acrobat JavaScript API reference says: "A collection object is obtained from the Doc.collection property. Doc.collection returns a null value when there is no PDF collection (also called PDF package and PDF portfolio). The collection object is used to set the initial document in the collection, set the initial view of the collection, and to get, add, and remove collection fields (or categories)."
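
For the Java side of the question, here is a rough sketch that needs neither Acrobat nor a third-party jar. It relies on the fact that a PDF collection is declared through a /Collection entry in the document catalog, and simply scans the raw bytes for that exact name. The class and method names (PortfolioSniffer, looksLikePortfolio) are made up for illustration. Treat it as a heuristic only: it will miss files whose catalog sits in a compressed object stream or is encrypted, so a PDF library that actually parses the catalog would be more reliable.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class PortfolioSniffer {

    // PDF whitespace and delimiter characters that can end a name token.
    private static final String NAME_TERMINATORS = " \t\n\r\f\0()<>[]{}/%";

    /**
     * Hypothetical helper: returns true if the raw bytes contain the exact
     * PDF name /Collection (heuristic, see the caveats in the text above).
     */
    static boolean looksLikePortfolio(byte[] pdf) {
        // ISO-8859-1 maps each byte to exactly one char, so nothing is lost.
        String text = new String(pdf, StandardCharsets.ISO_8859_1);
        String key = "/Collection";
        int from = 0;
        while (true) {
            int at = text.indexOf(key, from);
            if (at < 0) {
                return false;
            }
            int end = at + key.length();
            // Accept only the exact name, not longer ones such as /CollectionField.
            if (end == text.length() || NAME_TERMINATORS.indexOf(text.charAt(end)) >= 0) {
                return true;
            }
            from = end;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java PortfolioSniffer <file.pdf>");
            return;
        }
        byte[] bytes = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(looksLikePortfolio(bytes)
                ? "Probably a PDF Portfolio or Package"
                : "No /Collection entry found in the uncompressed data");
    }
}
```

In a web app, the same method could be called on the uploaded bytes to decide whether to show a compatibility warning. Because of the caveats above, a negative result does not prove the file is a regular PDF.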
