Split PDFs are larger than expected when using iTextSharp

Posted on 2024-10-30 02:13:43

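Dear team, in my application I want to split a PDF using iTextSharp. If I upload a PDF that contains 10 pages and is 10 MB in size, the combined size of the individual PDFs after splitting exceeds 20 MB. Is it possible to reduce the file size of each split PDF?

Please help me solve this problem.

Thanks in advance.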

橘寄 2024-11-06 02:13:43

This may have to do with the resources in the file. For example, if the original document uses the same embedded font on every page, the original file contains only one instance of that font. When you split it, each output file must carry its own copy of the font. The total overhead is roughly n pages × sizeof(each font). Elements that cause this kind of bloat include fonts, images, color profiles, document templates (aka forms), XMP, etc.
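To make that arithmetic concrete, here is a minimal sketch of the estimate. The names `SplitSizeEstimate`, `sharedBytes`, and `perPageBytes` are hypothetical, and the byte counts are made-up figures chosen only to mirror the sizes in the question, not measurements of any real file. It assumes every page uses every shared resource and ignores per-file structural overhead.

```csharp
using System;
using System.Linq;

public static class SplitSizeEstimate
{
    // Size of the original file: shared resources are stored once.
    public static long Original(long sharedBytes, long[] perPageBytes)
    {
        return sharedBytes + perPageBytes.Sum();
    }

    // Combined size after a one-file-per-page split, assuming every page
    // uses every shared resource, so each output file carries its own copy.
    public static long SplitTotal(long sharedBytes, long[] perPageBytes)
    {
        return sharedBytes * perPageBytes.Length + perPageBytes.Sum();
    }

    public static void Main()
    {
        long shared = 1200000; // illustrative: fonts, images, profiles used by all pages
        long[] pages = Enumerable.Repeat(880000L, 10).ToArray(); // illustrative page content

        Console.WriteLine(Original(shared, pages));   // 10000000
        Console.WriteLine(SplitTotal(shared, pages)); // 20800000
    }
}
```

Under these assumed figures, only about 1.2 MB of shared resources is enough to take a 10 MB, 10-page file past 20 MB once every page carries its own copy.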

And while it doesn't help with your immediate problem, if you use the PDF tools in Atalasoft dotImage, your task becomes a one-liner:

PdfDocument.Separate(userpassword, ownerpassword, origPath, destFolder, "Separated Page{0}.pdf", true);

This takes the PDF at origPath and creates one new file per page in destFolder, each named according to the pattern. The bool at the end controls whether an existing file is overwritten.

Disclaimer: I work for Atalasoft and wrote the PDF library (also used to work at Adobe on Acrobat versions 1, 2, 3, and 4).

随遇而安 2024-11-06 02:13:43

Hi guys, I modified the above code to split a PDF file into multiple PDF files, one per page.

        // Open the source once and reuse it for every page.
        iTextSharp.text.pdf.PdfReader reader = new iTextSharp.text.pdf.PdfReader(filepath);
        try
        {
            reader.RemoveUnusedObjects();
            int pageCount = reader.NumberOfPages;
            string dir = System.IO.Path.GetDirectoryName(filepath);
            string name = System.IO.Path.GetFileNameWithoutExtension(filepath);
            string ext = System.IO.Path.GetExtension(filepath);
            for (int i = 1; i <= pageCount; i++)
            {
                // e.g. "input.pdf" -> "input_1.pdf", written next to the source file
                string outfile = System.IO.Path.Combine(dir, name + "_" + i + ext);
                iTextSharp.text.Document doc = new iTextSharp.text.Document(reader.GetPageSizeWithRotation(i));
                using (System.IO.FileStream fs = new System.IO.FileStream(outfile, System.IO.FileMode.Create))
                {
                    iTextSharp.text.pdf.PdfCopy pdfCpy = new iTextSharp.text.pdf.PdfCopy(doc, fs);
                    // Request full compression before opening the document;
                    // some iTextSharp versions reject the call once it is open.
                    pdfCpy.SetFullCompression();
                    doc.Open();
                    pdfCpy.AddPage(pdfCpy.GetImportedPage(reader, i));
                    doc.Close(); // also closes the PdfCopy writer
                }
            }
        }
        finally
        {
            // Close the reader only after all pages are written; closing it
            // inside the loop breaks the page-size lookup for later pages.
            reader.Close();
        }
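To check whether a change such as full compression actually helps, it is useful to compare the combined size of the split files against the original. The following is a minimal sketch that uses only `System.IO`; the class and method names (`SplitSizeReport`, `TotalBytes`, `GrowthFactor`) are hypothetical helpers and not part of iTextSharp.

```csharp
using System;
using System.IO;
using System.Linq;

public static class SplitSizeReport
{
    // Sums the sizes of all files in 'directory' that match 'searchPattern'.
    public static long TotalBytes(string directory, string searchPattern)
    {
        return Directory.GetFiles(directory, searchPattern)
                        .Sum(path => new FileInfo(path).Length);
    }

    // Ratio of the combined split size to the original size (1.0 means no growth).
    public static double GrowthFactor(long originalBytes, long splitBytes)
    {
        if (originalBytes <= 0) throw new ArgumentOutOfRangeException(nameof(originalBytes));
        return (double)splitBytes / originalBytes;
    }
}
```

With the naming scheme used above, the split files for `input.pdf` would match a pattern such as `input_*.pdf`, so the call might look like `SplitSizeReport.GrowthFactor(new FileInfo(filepath).Length, SplitSizeReport.TotalBytes(dir, "input_*.pdf"))`. A factor well above 1.0 that does not drop after enabling compression suggests the growth comes from duplicated resources rather than from uncompressed streams.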

血之狂魔 2024-11-06 02:13:43

Have you tried setting the compression on the writer?

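Document doc = new Document();
using (MemoryStream ms = new MemoryStream())
{
    PdfWriter writer = PdfWriter.GetInstance(doc, ms);
    // Request full compression (object and cross-reference streams, PDF 1.5+)
    // before calling doc.Open().
    writer.SetFullCompression();
    // ... open the document, add content, then close it ...
}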
