Saving the fonts used in a PDF file to font files with iTextSharp

Posted 2024-08-18 10:31:17

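This is essentially the same as an earlier question that never got an answer, but hopefully someone who knows is watching now and can help.

I am looking for .NET code that can extract the fonts embedded in a PDF into font files. I am currently using iTextSharp, but I am open to other .NET libraries (e.g. PDFBox, PDF Clown). I can iterate over the information returned by BaseFont.GetDocumentFonts(), but it is not clear to me how to stream a font out to a font file.

Thanks, Kenny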


压抑⊿情绪 2024-08-25 10:31:18

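I contributed a response before, but in the interest of adding solid examples to topics on this site (something I badly needed three months ago), I will walk through the solution I ended up using.

I downloaded MuPDF and took mutool.exe from its bin folder. I then call it from C# as a separate process. It pulls out all of the fonts embedded in the PDF file and dumps them in the folder containing mutool.exe. After that it was just a matter of moving the fonts from there to the folder I wanted them in.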

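        // Requires: using System.Diagnostics; using System.IO;

        /// <summary>
        /// Extract all fonts from PDF
        /// </summary>
        /// <param name="strPDFName">Full path of the PDF file to process.</param>
        public static void ExtractAll(string strPDFName)
        {
            // strMUTOOL (path to mutool.exe) and strFontFinal (destination folder)
            // are static fields that are set elsewhere in the class.
            if (strMUTOOL != null && strFontFinal != null)
            {
                using (Process p = new Process())
                {
                    p.StartInfo.FileName = strMUTOOL;
                    p.StartInfo.Arguments = "extract \"" + strPDFName + "\"";
                    p.StartInfo.UseShellExecute = false;
                    p.StartInfo.RedirectStandardError = true;
                    p.StartInfo.RedirectStandardOutput = true;
                    p.StartInfo.CreateNoWindow = true;
                    // mutool writes the extracted files into its working directory.
                    p.StartInfo.WorkingDirectory = Path.GetDirectoryName(strMUTOOL);

                    p.Start();

                    // Drain both pipes before waiting: reading them only after
                    // WaitForExit can deadlock once a pipe buffer fills up.
                    var errorTask = p.StandardError.ReadToEndAsync();
                    var standardOutput = p.StandardOutput.ReadToEnd();
                    var standardError = errorTask.Result;
                    p.WaitForExit();

                    var exitCode = p.ExitCode;
                }
            }
        }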

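As a heads-up, most of these fonts are CFF files, and you will need to convert them if you plan on using them. Also, as has been stated, using these fonts may constitute software piracy if they are paid fonts. Finally, these fonts are usually only subsets: they contain just the glyphs used in the PDF, not the complete glyph set.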
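The last step described above, moving the extracted fonts out of the mutool folder, might be sketched roughly as follows. The class and method names (FontMover, IsFontFile, MoveExtractedFonts) and the list of extensions are assumptions made for illustration, not part of the original solution; check which file names your mutool version actually writes before relying on the filter.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class FontMover
{
    // Assumed set of font extensions; adjust to what mutool actually emits.
    private static readonly HashSet<string> FontExtensions =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        { ".ttf", ".otf", ".cff", ".pfa", ".pfb" };

    // True when the file name ends in one of the assumed font extensions.
    public static bool IsFontFile(string path)
    {
        return FontExtensions.Contains(Path.GetExtension(path));
    }

    // Moves every font file from sourceDir to targetDir and returns the count.
    public static int MoveExtractedFonts(string sourceDir, string targetDir)
    {
        Directory.CreateDirectory(targetDir);
        int moved = 0;
        foreach (string file in Directory.GetFiles(sourceDir))
        {
            if (!IsFontFile(file)) continue;
            string destination = Path.Combine(targetDir, Path.GetFileName(file));
            if (File.Exists(destination)) File.Delete(destination); // overwrite older copy
            File.Move(file, destination);
            moved++;
        }
        return moved;
    }
}
```

With the fields from the method above, a call would look something like `FontMover.MoveExtractedFonts(Path.GetDirectoryName(strMUTOOL), strFontFinal);`. Filtering by extension keeps any non-font files in the mutool folder out of the destination folder.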



云巢 2024-08-25 10:31:18

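I did not get an answer, but I did find a couple of vendor-based solutions. The pdfextract.exe tool from pdf-tools.com works well. The library from fastpdflibrary.com also works well; it is the vendor we went with, and so far we are very satisfied.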


风苍溪 2024-08-25 10:31:17

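@Highmastdon - getting the font names is actually really simple, at least in iText/iTextSharp (PDFBox as well, but I don't have that code to hand right now). In iTextSharp you would do the following: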

PdfReader reader = new PdfReader(strFileName);
List<object[]> strFonts = BaseFont.GetDocumentFonts(reader);

And there it is: most libraries have built-in support for simple font extraction (the names, in any case).
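Going one step beyond the names, the list returned by BaseFont.GetDocumentFonts can in principle be followed to the embedded font programs themselves. The following is an untested sketch that assumes iTextSharp 5.x; the class name EmbeddedFontDumper, the method Dump, and the output file naming are made up for illustration, and the member names should be checked against the iTextSharp version in use.

```csharp
using System.Collections.Generic;
using System.IO;
using iTextSharp.text.pdf;

public static class EmbeddedFontDumper
{
    // Writes the raw embedded font program of each document font to outputDir.
    public static void Dump(string pdfPath, string outputDir)
    {
        Directory.CreateDirectory(outputDir);
        PdfReader reader = new PdfReader(pdfPath);
        try
        {
            int index = 0;
            foreach (object[] entry in BaseFont.GetDocumentFonts(reader))
            {
                // entry[0] is the font name, entry[1] the indirect reference
                // to the font dictionary.
                PdfDictionary font =
                    PdfReader.GetPdfObject((PdfObject)entry[1]) as PdfDictionary;
                if (font == null) continue;

                // Composite (Type0) fonts keep the descriptor on the descendant font.
                PdfArray descendants = font.GetAsArray(PdfName.DESCENDANTFONTS);
                if (descendants != null && descendants.Size > 0)
                    font = descendants.GetAsDict(0);
                if (font == null) continue;

                PdfDictionary descriptor = font.GetAsDict(PdfName.FONTDESCRIPTOR);
                if (descriptor == null) continue; // nothing embedded for this font

                // FontFile = Type 1, FontFile2 = TrueType, FontFile3 = CFF/OpenType.
                PRStream stream = (descriptor.GetAsStream(PdfName.FONTFILE)
                                ?? descriptor.GetAsStream(PdfName.FONTFILE2)
                                ?? descriptor.GetAsStream(PdfName.FONTFILE3)) as PRStream;
                if (stream == null) continue;

                // Decoded stream bytes; hypothetical naming scheme for the output.
                byte[] data = PdfReader.GetStreamBytes(stream);
                string name = "font-" + (index++) + ".bin";
                File.WriteAllBytes(Path.Combine(outputDir, name), data);
            }
        }
        finally
        {
            reader.Close();
        }
    }
}
```

The bytes written this way are the raw font programs as stored in the PDF, so they are typically subsets and, for FontFile3 streams, bare CFF data that still needs converting before it can be used as an ordinary font file.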



About the author: 淡水深流
