iText memory management - PdfReader/Watermarking loads too much

Posted on 2024-11-29 17:34:48

I'm watermarking documents, and I don't want to load them completely into memory, as they can be quite large. I found that RandomAccessFileOrArray buffers the reading, which works, but it still loads more than I would like.

Specifically, after I load a 5 MB PDF file, used memory increases by 23 MB. When I start watermarking, it jumps by another 27 MB. After that, used memory keeps increasing gradually, but not alarmingly.

Is there a reason for this behaviour? Do you know a way to set the buffer size of PdfReader, RandomAccessFileOrArray, or something else?

Thanks for your input.


The printMem method reports the memory status as free, used, and total, in bytes.
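
The printMem helper itself is not shown. A minimal sketch of what it might look like, assuming it takes its figures from java.lang.Runtime and prints them in the same "label | free used total" layout as the output in this post. The class name MemoryProbe and the formatMem helper are hypothetical, introduced only for illustration:

```java
public class MemoryProbe {

    /** Formats one log line as "label | free used total" (all values in bytes). */
    public static String formatMem(String label, long free, long total) {
        long used = total - free; // used heap is derived from the two JVM figures
        return label + " | " + free + " " + used + " " + total;
    }

    /** Prints the current JVM heap figures in the same layout. */
    public static void printMem(String label) {
        Runtime rt = Runtime.getRuntime();
        System.out.println(formatMem(label, rt.freeMemory(), rt.totalMemory()));
    }

    public static void main(String[] args) {
        printMem("Before load");
    }
}
```

A reading taken this way also counts garbage that has not been collected yet, so the difference between two readings is an upper bound on what is actually retained.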

Here is my code:

    printMem("Before load");
    PdfReader reader = null;
    try {
        reader = new PdfReader(new RandomAccessFileOrArray(new FileInputStream("C:/TEMP/zip/100258.pdf")), null);
        printMem("After load");
        FileOutputStream out = new FileOutputStream(f);
        PdfStamper stamp = new PdfStamper(reader, out);

        int numPages = reader.getNumberOfPages();
        int page = 1;
        BaseFont baseFont =
            BaseFont.createFont(BaseFont.HELVETICA_BOLDOBLIQUE,
                BaseFont.WINANSI, BaseFont.EMBEDDED);
        float width;
        float height;

        while (page <= numPages) {
            printMem("Page " + page);
            PdfContentByte cb = stamp.getOverContent(page);
            height = reader.getPageSizeWithRotation(page).getHeight() / 2;
            width = reader.getPageSizeWithRotation(page).getWidth() / 2;

            cb.saveState();
            cb.setColorFill(MEDIUM_GRAY);

            // Primary Text
            cb.beginText();
            cb.setFontAndSize(baseFont, PRIMARY_FONT_SIZE);
            cb.showTextAligned(Element.ALIGN_CENTER, "WatermarkText", width,
                    height, TEXT_TILT_ANGLE);
            cb.endText();

            cb.restoreState();
            page++;
        }
        stamp.close();
    } catch (Throwable e) {
        reader = null;
        System.gc();
    }

And here is part of the output (label | free used total, in bytes):

Before load | 1566248160 6615840 1572864000
After load | 1542392472 30471528 1572864000
Page 1 | 1515096880 57767120 1572864000
Page 2 | 1515095992 57768008 1572864000
Page 47 | 1512998840 59865160 1572864000
Page 48 | 1512998840 59865160 1572864000
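
To make the jumps easier to read, the differences between the "used" figures can be computed directly. The sketch below is only illustrative (the class name MemoryDeltas is hypothetical) and uses the used-memory values copied from the output above:

```java
import java.util.Locale;

public class MemoryDeltas {

    /** Returns the increase in used bytes between consecutive readings. */
    public static long[] deltas(long[] used) {
        long[] out = new long[Math.max(0, used.length - 1)];
        for (int i = 1; i < used.length; i++) {
            out[i - 1] = used[i] - used[i - 1];
        }
        return out;
    }

    public static void main(String[] args) {
        // Labels and "used" values taken from the log lines in the question.
        String[] labels = {"Before load", "After load", "Page 1", "Page 48"};
        long[] used = {6615840L, 30471528L, 57767120L, 59865160L};
        long[] d = deltas(used);
        for (int i = 0; i < d.length; i++) {
            System.out.printf(Locale.ROOT, "%s -> %s: +%d bytes (%.1f MB)%n",
                    labels[i], labels[i + 1], d[i], d[i] / 1e6);
        }
    }
}
```

The three differences are 23855688, 27295592, and 2098040 bytes: roughly 23.9 MB on load, 27.3 MB when stamping starts, and about 2.1 MB across pages 1 to 48.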

Comments (1)

眼泪都笑了 2024-12-06 17:34:48

The document is read only partially if you construct the RandomAccessFileOrArray with a String containing the path to the file (e.g. new RandomAccessFileOrArray("/path/to/pdf");). With an InputStream or a URL, the whole document is copied into an internal byte array.
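
Applied to the question's code, this would mean passing the path string to RandomAccessFileOrArray instead of wrapping a FileInputStream. The difference between the two access patterns can be illustrated with the JDK alone. The following is a rough sketch, not iText's actual implementation: the class PartialReadSketch and its methods are hypothetical, and it only contrasts seeking into a file with copying a whole stream onto the heap.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

public class PartialReadSketch {

    /** Reads only the last n bytes by seeking; the heap holds just those bytes. */
    public static byte[] readTail(File file, int n) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            int len = (int) Math.min(n, raf.length());
            byte[] tail = new byte[len];
            raf.seek(raf.length() - len);
            raf.readFully(tail);
            return tail;
        }
    }

    /** Copies the whole stream into a byte array; the heap holds the full content. */
    public static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int read;
        while ((read = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, read);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Build a 5 MB placeholder file that ends with a PDF-style end marker.
        File file = File.createTempFile("partial-read", ".bin");
        file.deleteOnExit();
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            raf.setLength(5L * 1024 * 1024);
            raf.seek(raf.length() - 5);
            raf.write("%%EOF".getBytes(StandardCharsets.US_ASCII));
        }

        byte[] tail = readTail(file, 5);
        System.out.println("bytes held after seeking to the tail: " + tail.length);

        try (InputStream in = new FileInputStream(file)) {
            System.out.println("bytes held after copying the stream: " + readAll(in).length);
        }
    }
}
```

A stream can only be read front to back, whereas a PDF parser needs to jump around the file (the cross-reference data is located from the end), which is a plausible reason a stream-based source ends up fully buffered while a file path allows on-demand reads.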
