Splitting a PDF with PDFBox, but losing the font
I wrote some code in Java using the PDFBox API that splits a PDF document into its individual pages, searches each page for a specific string, and then builds a new PDF from the pages that contain it. My problem is that when the new page is saved, I lose my font. I made a quick Word document to test with, whose default font was Calibri, and when I run the program I get an error box that reads: "Cannot extract the embedded font..." The font is then replaced with some other default.
I have seen a lot of example code that shows how to change the font when you are adding text to a PDF, but nothing that sets the font for an existing PDF page.
If anyone is familiar with a way to do this (or can point to documentation or examples), I would greatly appreciate it!
Edit: I forgot to include some sample code:
if (pageContent.indexOf(findThis) >= 0) {
    PDPage pageToRip = pages.get(i);
    // >> set the font of pageToRip here
    res.importPage(pageToRip); // res is the new document that will be saved
}
I don't know if that helps any, but I figured I'd include it.
Also, this is what the change looks like if the PDF is written in Calibri and then split:
Note: This might be a non-issue, since it depends on the fonts used in the files that will need to be processed. I tried some fonts besides Calibri and they worked out fine.
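For readers following along, the page-selection step described in the question can be sketched on its own, without any PDF library. This is only an illustration under stated assumptions: the class `PageMatcher` and the method `matchingPages` are hypothetical names, and each page's text is assumed to have already been extracted into a plain `String` (with PDFBox, presumably via its text stripper). It does not touch fonts and does not address the "Cannot extract the embedded font" error.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: isolates the "which pages contain the string?" step.
// Page text is assumed to be extracted already; no PDF library is used here.
class PageMatcher {

    /** Returns the zero-based indexes of the pages whose text contains findThis. */
    static List<Integer> matchingPages(List<String> pageTexts, String findThis) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < pageTexts.size(); i++) {
            String pageContent = pageTexts.get(i);
            // Same test as the question's indexOf(findThis) >= 0 check.
            if (pageContent != null && pageContent.contains(findThis)) {
                hits.add(i);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> pageTexts = List.of("cover page", "invoice total: 42", "appendix");
        System.out.println(matchingPages(pageTexts, "invoice")); // prints [1]
    }
}
```

Each returned index would then be used to fetch the corresponding page and pass it to `res.importPage(...)`, as in the snippet from the question.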
Comments (1)
From How to extract fonts from a PDF: