Saving the fonts used in a PDF file to disk using iTextSharp

Posted 2024-08-18 10:31:17

This is pretty much a duplicate of this unanswered question, but hopefully someone in the know is watching now and can be helpful.

I'm looking for a way to have some .NET code extract the fonts embedded in a PDF and write them out as font files. I'm currently using iTextSharp, but I'm open to other .NET libraries (e.g. PDFBox, PDF Clown, etc.). I'm able to iterate over the information returned by BaseFont.GetDocumentFonts(), but I'm not clear on how to stream each font out to a font file.

Thanks, Kenny

Comments (3)

压抑⊿情绪 2024-08-25 10:31:18

I contributed a response before, but in the interest of adding solid examples to topics on this site (something I dreadfully needed three months ago), I will walk through the solution I ended up using.

I downloaded MuPDF, went into the bin folder, and retrieved the file mutool.exe. I then call it from C# as a separate process. It pulls out all of the fonts embedded in the PDF file and dumps them in the folder containing mutool.exe (the process's working directory). After that it was just a matter of moving the fonts from there to the folder I wanted them in.

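        /// <summary>
        /// Extract all fonts from a PDF by running "mutool extract" on it.
        /// Requires: using System.Diagnostics; using System.IO;
        /// </summary>
        /// <param name="strPDFName">Path to the PDF; use an absolute path,
        /// because the working directory is changed below.</param>
        public static void ExtractAll(string strPDFName)
        {
            // strMUTOOL (full path to mutool.exe) and strFontFinal (destination
            // folder for the fonts) are fields defined elsewhere in the class.
            if (strMUTOOL != null && strFontFinal != null)
            {
                using (Process p = new Process())
                {
                    p.StartInfo.FileName = strMUTOOL;
                    p.StartInfo.Arguments = "extract \"" + strPDFName + "\"";
                    p.StartInfo.UseShellExecute = false;
                    p.StartInfo.RedirectStandardError = true;
                    p.StartInfo.RedirectStandardOutput = true;
                    p.StartInfo.CreateNoWindow = true;
                    // mutool writes its output files into the working directory.
                    p.StartInfo.WorkingDirectory = Path.GetDirectoryName(strMUTOOL);

                    p.Start();

                    // Drain both redirected streams before waiting for exit;
                    // waiting first can deadlock once a pipe buffer fills up.
                    var errorTask = p.StandardError.ReadToEndAsync();
                    var standardOutput = p.StandardOutput.ReadToEnd();
                    var standardError = errorTask.Result;
                    p.WaitForExit();

                    var exitCode = p.ExitCode;
                }
            }
        }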

A heads-up: most of the extracted fonts are CFF files, so you will need to convert them if you plan to use them. Also, as has been stated, using these fonts may constitute software piracy if they are commercial fonts. Finally, they are usually only subsets and do not contain the complete glyph set, just the glyphs used in the PDF.
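The final "move the fonts to the folder I wanted" step is not shown in the answer. A minimal sketch of it might look like the following. The class name, method name, and the list of file extensions are assumptions made for illustration; check what your version of mutool actually writes (it also extracts images into the same folder) before relying on the filter.

```csharp
using System;
using System.IO;

public static class FontMover
{
    // Hypothetical list of extensions to treat as font files; adjust it to
    // match what mutool actually produces for your PDFs.
    private static readonly string[] FontExtensions =
        { ".ttf", ".otf", ".cff", ".pfa", ".pfb", ".cid" };

    /// <summary>
    /// Moves every file with a font-like extension from sourceDir to destDir,
    /// overwriting files of the same name. Returns the number of files moved.
    /// </summary>
    public static int MoveExtractedFonts(string sourceDir, string destDir)
    {
        Directory.CreateDirectory(destDir); // no-op if it already exists
        int moved = 0;

        // GetFiles returns a snapshot array, so moving files while looping is safe.
        foreach (string path in Directory.GetFiles(sourceDir))
        {
            string ext = Path.GetExtension(path).ToLowerInvariant();
            if (Array.IndexOf(FontExtensions, ext) < 0)
                continue; // skip images and anything else in the folder

            string target = Path.Combine(destDir, Path.GetFileName(path));
            if (File.Exists(target))
                File.Delete(target);
            File.Move(path, target);
            moved++;
        }
        return moved;
    }
}
```

With the fields used in ExtractAll, the call would be something like FontMover.MoveExtractedFonts(Path.GetDirectoryName(strMUTOOL), strFontFinal), run after the mutool process has exited.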

云巢 2024-08-25 10:31:18

I didn't get an answer, but I did find several vendor-based solutions. pdfextract.exe from pdf-tools.com works very well. The library from quickpdflibrary.com also works very well; it is the vendor we went with, and so far we are very happy with it.

风苍溪 2024-08-25 10:31:17

@Highmastdon - it is actually really simple to get the font names, at least in iText/iTextSharp (PDFBox as well, but I don't have that code around right now). In iTextSharp you would do the following:

PdfReader reader = new PdfReader(strFileName);
List<object[]> strFonts = BaseFont.GetDocumentFonts(reader);

And there it is. Most libraries have built-in support for this kind of simple font extraction (of the names, at any rate).
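Each entry returned by GetDocumentFonts pairs the font name with an indirect reference to the font dictionary, so in principle the embedded font program can be followed from there. The sketch below is not from the answers above; it assumes iTextSharp 5.x and the usual PDF layout (a FontDescriptor holding a FontFile, FontFile2, or FontFile3 stream, reached through DescendantFonts for composite fonts). The class name, method name, and output extensions are made up for illustration, and the bytes written are the raw font program, typically a subset that may need conversion before use.

```csharp
using System;
using System.IO;
using iTextSharp.text.pdf;

public static class FontDump
{
    /// <summary>
    /// Writes the raw embedded font program of each document font to outputDir.
    /// Fonts that are not embedded are skipped.
    /// </summary>
    public static void DumpEmbeddedFonts(string pdfPath, string outputDir)
    {
        Directory.CreateDirectory(outputDir);
        PdfReader reader = new PdfReader(pdfPath);
        try
        {
            // Each entry is { font name (string), indirect reference to the font dictionary }.
            foreach (object[] entry in BaseFont.GetDocumentFonts(reader))
            {
                string name = (string)entry[0];
                PdfDictionary font = PdfReader.GetPdfObject((PdfObject)entry[1]) as PdfDictionary;
                if (font == null) continue;

                // Composite (Type0) fonts keep the descriptor on their descendant font.
                PdfArray descendants = font.GetAsArray(PdfName.DESCENDANTFONTS);
                if (descendants != null && descendants.Size > 0)
                    font = descendants.GetAsDict(0);

                PdfDictionary descriptor =
                    font == null ? null : font.GetAsDict(PdfName.FONTDESCRIPTOR);
                if (descriptor == null) continue; // e.g. a non-embedded standard font

                // FontFile = Type 1, FontFile2 = TrueType, FontFile3 = CFF or OpenType.
                // The extensions are labels chosen for this sketch only.
                PdfName[] keys = { PdfName.FONTFILE, PdfName.FONTFILE2, PdfName.FONTFILE3 };
                string[] extensions = { ".type1", ".ttf", ".cff" };
                for (int i = 0; i < keys.Length; i++)
                {
                    PRStream stream = descriptor.GetAsStream(keys[i]) as PRStream;
                    if (stream == null) continue;

                    // Font names can contain characters that are invalid in file names.
                    string safeName = name;
                    foreach (char c in Path.GetInvalidFileNameChars())
                        safeName = safeName.Replace(c, '_');

                    byte[] bytes = PdfReader.GetStreamBytes(stream); // decoded stream data
                    File.WriteAllBytes(Path.Combine(outputDir, safeName + extensions[i]), bytes);
                }
            }
        }
        finally
        {
            reader.Close();
        }
    }
}
```

The same licensing and subsetting caveats mentioned for the mutool approach apply to anything extracted this way.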
