Using iTextSharp to save the fonts used in a PDF file to font files
This is pretty much a duplicate of this unanswered question, but hopefully someone in the know is watching now and can be helpful.
I'm looking for a way to have some .NET code extract the fonts embedded in a PDF to font files. I'm currently using iTextSharp, but I'm open to other .NET libraries (e.g. PDFBox, PDF Clown, etc.). I'm able to iterate over the information from BaseFont.GetDocumentFonts(), but I'm not clear on how to stream a font out to a font file.
Thanks, Kenny
Comments (3)
I contributed a response before, but in the interests of adding solid examples to topics on this site (something I dreadfully needed three months ago) I will iterate through the solution I ended up using.
I downloaded MuPDF, went into the bin folder, and retrieved the file mutool.exe. I then call it from C# as a separate process. It pulls all of the fonts embedded in the PDF file and dumps them in the folder containing mutool.exe. After that, it was just a matter of moving the fonts from there to the folder I wanted them in.
As a bit of a heads up, most of these fonts are CFF files and you will need to convert them if you plan on using them. Also, as has been stated, using these fonts may constitute software piracy if these fonts are paid fonts. Finally, these fonts are usually only subsets and do not contain the complete glyph set - just the glyphs used in the PDF.
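The approach above can be sketched roughly as follows. This is a minimal sketch, not the code I actually shipped: the class and method names (`MuToolFontExtractor`, `BuildStartInfo`, `ExtractFonts`) are made up for illustration, and it assumes a MuPDF build whose `mutool` provides the `extract` subcommand. Note that `mutool extract` writes embedded images as well as fonts, and it writes them to the process's working directory, so the sketch points the working directory at the output folder rather than moving the files afterwards.

```csharp
using System;
using System.Diagnostics;
using System.IO;

public static class MuToolFontExtractor
{
    // Describes the command: mutool extract "<absolute path to pdf>"
    // mutoolPath should be an absolute path to mutool.exe (assumption:
    // the caller knows where MuPDF was unpacked).
    public static ProcessStartInfo BuildStartInfo(string mutoolPath, string pdfPath, string outputDir)
    {
        return new ProcessStartInfo
        {
            FileName = mutoolPath,
            // The PDF path is made absolute because the working directory changes below.
            Arguments = "extract \"" + Path.GetFullPath(pdfPath) + "\"",
            // mutool writes the extracted files into its working directory.
            WorkingDirectory = outputDir,
            UseShellExecute = false,
            CreateNoWindow = true
        };
    }

    // Runs mutool and returns its exit code (0 normally means success).
    public static int ExtractFonts(string mutoolPath, string pdfPath, string outputDir)
    {
        Directory.CreateDirectory(outputDir);
        using (Process process = Process.Start(BuildStartInfo(mutoolPath, pdfPath, outputDir)))
        {
            if (process == null)
            {
                throw new InvalidOperationException("Could not start mutool.");
            }
            process.WaitForExit();
            return process.ExitCode;
        }
    }
}
```

A call might look like `MuToolFontExtractor.ExtractFonts(@"C:\tools\mupdf\mutool.exe", "input.pdf", @"C:\temp\fonts")`, where all three paths are placeholders. The names and extensions of the output files depend on the MuPDF version, so it is safer to enumerate the output folder afterwards than to guess file names.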
I didn't get an answer, but I did find several vendor-based solutions. The software from pdf-tools.com, pdfextract.exe, works very well. The library from quickpdflibrary.com also works very well; it is the vendor we went with, and so far we are very happy.
@Highmastdon - it is actually really simple to get the font names, at least in iText/iTextSharp (PDFBox as well, but I don't have the code around right now). In iTextSharp you would do the following:
And there it is: most libraries have built-in support for simple extraction of fonts (the names, in any case).
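For reference, listing the font names might look something like the sketch below. It is a minimal sketch under assumptions, not a definitive implementation: it assumes iTextSharp 5.x, where `BaseFont.GetDocumentFonts(PdfReader)` returns a collection of `object[]` entries whose first element is the font name and whose second is an indirect reference to the font dictionary. The class and method names (`PdfFontLister`, `GetFontNames`) are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using iTextSharp.text.pdf;

public static class PdfFontLister
{
    // Returns the names of the fonts referenced by the PDF at pdfPath.
    public static List<string> GetFontNames(string pdfPath)
    {
        var names = new List<string>();
        var reader = new PdfReader(pdfPath);
        try
        {
            // Each entry is assumed to be { font name, reference to the font dictionary }.
            foreach (object[] entry in BaseFont.GetDocumentFonts(reader))
            {
                names.Add(Convert.ToString(entry[0]));
            }
        }
        finally
        {
            reader.Close();
        }
        return names;
    }
}
```

This only gives you the names. Subset fonts typically show up with a six-letter prefix followed by a plus sign (for example `ABCDEF+SomeFont`), and getting at the actual font program still means reading the embedded font stream, which is the part the question asks about.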