iText memory management - PdfReader/watermarking loads too much
I'm watermarking documents and don't want to load them completely into memory, as they can be quite large. I found that RandomAccessFileOrArray buffers the reading to some extent, which works, but it still loads more than I would like.
That is, after I load a 5 MB PDF file, used memory increases by 23 MB! And when I start watermarking it, used memory jumps by another 27 MB! After that it grows gradually, but not alarmingly.
Is there a reason for this behaviour? Do you know a way to set the buffer size of PdfReader, RandomAccessFileOrArray, or something else?
Thanks for your input.
The printMem method reports the memory status as free - used - total.
Here is my code:
printMem("Before load");
PdfReader reader = null;
try {
    // The reader is built on a RandomAccessFileOrArray that wraps a FileInputStream
    reader = new PdfReader(new RandomAccessFileOrArray(new FileInputStream("C:/TEMP/zip/100258.pdf")), null);
    printMem("After load");
    FileOutputStream out = new FileOutputStream(f);
    PdfStamper stamp = new PdfStamper(reader, out);
    int numPages = reader.getNumberOfPages();
    int page = 1;
    BaseFont baseFont = BaseFont.createFont(BaseFont.HELVETICA_BOLDOBLIQUE,
            BaseFont.WINANSI, BaseFont.EMBEDDED);
    float width;
    float height;
    while (page <= numPages) {
        printMem("Page " + page);
        PdfContentByte cb = stamp.getOverContent(page);
        // Centre of the page, taking page rotation into account
        height = reader.getPageSizeWithRotation(page).getHeight() / 2;
        width = reader.getPageSizeWithRotation(page).getWidth() / 2;
        cb.saveState();
        cb.setColorFill(MEDIUM_GRAY);
        // Primary Text
        cb.beginText();
        cb.setFontAndSize(baseFont, PRIMARY_FONT_SIZE);
        cb.showTextAligned(Element.ALIGN_CENTER, "WatermarkText", width,
                height, TEXT_TILT_ANGLE);
        cb.endText();
        cb.restoreState();
        page++;
    }
    stamp.close();
} catch (Throwable e) {
    // Note: the exception itself is discarded here
    reader = null;
    System.gc();
}
And here is the partial output:
Before load | 1566248160 6615840 1572864000
After load | 1542392472 30471528 1572864000
Page 1 | 1515096880 57767120 1572864000
Page 2 | 1515095992 57768008 1572864000
Page 47 | 1512998840 59865160 1572864000
Page 48 | 1512998840 59865160 1572864000
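The printMem helper itself is not shown in the post. A minimal sketch of how such a helper could produce the free - used - total columns above is given here, assuming it relies on java.lang.Runtime and reports bytes; the class name MemoryReport and the formatMem method are hypothetical and only serve the illustration.

```java
class MemoryReport {

    /**
     * Builds one report line in the "label | free used total" layout seen
     * in the output above. "used" is derived as total minus free.
     * (Hypothetical helper; the original printMem is not shown.)
     */
    static String formatMem(String label, long free, long total) {
        long used = total - free;
        return label + " | " + free + " " + used + " " + total;
    }

    /** Prints the current JVM heap figures, in bytes, under the given label. */
    static void printMem(String label) {
        Runtime rt = Runtime.getRuntime();
        System.out.println(formatMem(label, rt.freeMemory(), rt.totalMemory()));
    }

    public static void main(String[] args) {
        printMem("Before load");
    }
}
```

Figures obtained this way include garbage that has not been collected yet, so the difference between two readings is an upper bound on what the objects created in between actually retain.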
Comments (1)
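The document is only partially read if you construct the RandomAccessFileOrArray with a String containing the path to the file, e.g. new RandomAccessFileOrArray("/path/to/pdf"). With an InputStream or a URL, the whole document is copied into an internal byte array. In the question's code, that means passing the path "C:/TEMP/zip/100258.pdf" directly to RandomAccessFileOrArray instead of wrapping it in a FileInputStream.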
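The difference between the two construction paths can be illustrated with plain JDK classes. The sketch below is not iText code and makes no claim about iText's internals; it only shows, under that assumption, why a file path allows seeking to the part that is needed while a stream has to be buffered in full before its end can be reached. The class name TailReadSketch, its methods, and the "%%EOF" test data are all hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

class TailReadSketch {

    /** File-backed access: seek near the end and read only n bytes. */
    static byte[] tailFromFile(Path file, int n) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            long length = raf.length();
            int count = (int) Math.min(n, length);
            byte[] tail = new byte[count];
            raf.seek(length - count);
            raf.readFully(tail);
            return tail;
        }
    }

    /** Stream-backed access: a stream cannot seek, so all of it is buffered first. */
    static byte[] tailFromStream(InputStream in, int n) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int read;
        while ((read = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, read);
        }
        byte[] whole = buffer.toByteArray(); // heap copy as large as the input
        int count = Math.min(n, whole.length);
        return Arrays.copyOfRange(whole, whole.length - count, whole.length);
    }

    /** Runs both variants on a throwaway file and returns the two tails. */
    static String demo() {
        try {
            Path tmp = Files.createTempFile("tail-sketch", ".bin");
            try {
                byte[] data = new byte[100_000]; // filler standing in for page data
                byte[] marker = "%%EOF".getBytes(StandardCharsets.US_ASCII);
                System.arraycopy(marker, 0, data, data.length - marker.length, marker.length);
                Files.write(tmp, data);

                String fromFile = new String(tailFromFile(tmp, marker.length), StandardCharsets.US_ASCII);
                String fromStream;
                try (InputStream in = Files.newInputStream(tmp)) {
                    fromStream = new String(tailFromStream(in, marker.length), StandardCharsets.US_ASCII);
                }
                return fromFile + " " + fromStream;
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "%%EOF %%EOF"
    }
}
```

Both variants return the same bytes, but the file-backed one allocates only the five bytes it reads, while the stream-backed one holds the whole input on the heap at once.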