How to check that all used fonts are embedded in a PDF with Java iText?

Posted on 2024-10-11 05:17:31

How can I check, with Java and iText, that all fonts used in a PDF file are embedded in the file? I have some existing PDF documents, and I'd like to validate that they use only embedded fonts.

This would require checking that no PDF standard fonts are used and that all other used fonts are embedded in the file.
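
The first half of that check, spotting the PDF standard fonts, comes down to comparing each font's BaseFont name against the 14 standard Type 1 font names. Below is a minimal sketch using only the JDK. The class `StandardFontNames` and its method are hypothetical helpers, not part of iText, and the input is assumed to be a BaseFont name with or without the leading slash that `PdfName.toString()` produces.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper (not part of iText): recognises the 14 standard PDF fonts by name. */
final class StandardFontNames {

    // The 14 standard Type 1 font names defined by the PDF specification.
    private static final Set<String> STANDARD_14 = Collections.unmodifiableSet(
        new HashSet<String>(Arrays.asList(
            "Times-Roman", "Times-Bold", "Times-Italic", "Times-BoldItalic",
            "Helvetica", "Helvetica-Bold", "Helvetica-Oblique", "Helvetica-BoldOblique",
            "Courier", "Courier-Bold", "Courier-Oblique", "Courier-BoldOblique",
            "Symbol", "ZapfDingbats")));

    private StandardFontNames() { }

    /**
     * Returns true if the given BaseFont name is one of the 14 standard fonts.
     * A leading '/' is ignored; null yields false.
     */
    static boolean isStandard14(String baseFontName) {
        if (baseFontName == null)
            return false;
        String name = baseFontName.startsWith("/") ? baseFontName.substring(1) : baseFontName;
        return STANDARD_14.contains(name);
    }
}
```

A name match only says that the font is one of the standard 14. Such a font can still be embedded, so the font descriptor still has to be inspected for an embedded font file.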

Comments (5)

半衬遮猫 2024-10-18 05:17:31

Look at the ListUsedFonts example from iText in Action.

http://itextpdf.com/examples/iia.php?id=287

It appears to print the fonts used in a PDF and whether they are embedded.

/*
 * This class is part of the book "iText in Action - 2nd Edition"
 * written by Bruno Lowagie (ISBN: 9781935182610)
 * For more info, go to: http://itextpdf.com/examples/
 * This example only works with the AGPL version of iText.
 */

package part4.chapter16;

import java.io.FileOutputStream;
import java.io.IOException;
import java.io.PrintWriter;
import java.util.Set;
import java.util.TreeSet;

import part3.chapter11.FontTypes;

import com.itextpdf.text.DocumentException;
import com.itextpdf.text.pdf.PdfDictionary;
import com.itextpdf.text.pdf.PdfName;
import com.itextpdf.text.pdf.PdfReader;

public class ListUsedFonts {

    /** The resulting PDF file. */
    public static String RESULT
        = "results/part4/chapter16/fonts.txt";

    /**
     * Creates a Set containing information about the fonts in the src PDF file.
     * @param src the path to a PDF file
     * @throws IOException
     */
    public Set<String> listFonts(String src) throws IOException {
        Set<String> set = new TreeSet<String>();
        PdfReader reader = new PdfReader(src);
        PdfDictionary resources;
        for (int k = 1; k <= reader.getNumberOfPages(); ++k) {
            resources = reader.getPageN(k).getAsDict(PdfName.RESOURCES);
            processResource(set, resources);
        }
        reader.close();
        return set;
    }

    /**
     * Extracts the font names from page or XObject resources.
     * @param set the set with the font names
     * @param resources the resources dictionary
     */
    public static void processResource(Set<String> set, PdfDictionary resource) {
        if (resource == null)
            return;
        PdfDictionary xobjects = resource.getAsDict(PdfName.XOBJECT);
        if (xobjects != null) {
            for (PdfName key : xobjects.getKeys()) {
                processResource(set, xobjects.getAsDict(key));
            }
        }
        PdfDictionary fonts = resource.getAsDict(PdfName.FONT);
        if (fonts == null)
            return;
        PdfDictionary font;
        for (PdfName key : fonts.getKeys()) {
            font = fonts.getAsDict(key);
            String name = font.getAsName(PdfName.BASEFONT).toString();
            if (name.length() > 8 && name.charAt(7) == '+') {
                name = String.format("%s subset (%s)", name.substring(8), name.substring(1, 7));
            }
            else {
                name = name.substring(1);
                PdfDictionary desc = font.getAsDict(PdfName.FONTDESCRIPTOR);
                if (desc == null)
                    name += " nofontdescriptor";
                else if (desc.get(PdfName.FONTFILE) != null)
                    name += " (Type 1) embedded";
                else if (desc.get(PdfName.FONTFILE2) != null)
                    name += " (TrueType) embedded";
                else if (desc.get(PdfName.FONTFILE3) != null)
                    name += " (" + font.getAsName(PdfName.SUBTYPE).toString().substring(1) + ") embedded";
            }
            set.add(name);
        }
    }

    /**
     * Main method.
     *
     * @param    args    no arguments needed
     * @throws DocumentException 
     * @throws IOException
     */
    public static void main(String[] args) throws IOException, DocumentException {
        new FontTypes().createPdf(FontTypes.RESULT);
        Set<String> set = new ListUsedFonts().listFonts(FontTypes.RESULT);
        PrintWriter out = new PrintWriter(new FileOutputStream(RESULT));
        for (String fontname : set)
            out.println(fontname);
        out.flush();
        out.close();
    }
}

甚是思念 2024-10-18 05:17:31
/**
 * Creates a set containing information about the not-embedded fonts within the src PDF file.
 * @param src the path to a PDF file
 * @throws IOException
 */
public Set<String> listFonts(String src) throws IOException {
    Set<String> set = new TreeSet<String>();
    PdfReader reader = new PdfReader(src);
    PdfDictionary resources;
    for (int k = 1; k <= reader.getNumberOfPages(); ++k) {
        resources = reader.getPageN(k).getAsDict(PdfName.RESOURCES);
        processResource(set, resources);
    }
    reader.close();
    return set;
}

/**
 * Finds out if the font is an embedded subset font
 * @param name the BaseFont name, including the leading slash
 * @return true if the name denotes an embedded subset font
 */
private boolean isEmbeddedSubset(String name) {
    // A subset name looks like "/ABCDEF+FontName": the '+' sits at index 7.
    return name != null && name.length() > 8 && name.charAt(7) == '+';
}

/**
 * Adds the font's name to the set if the font is not embedded.
 * Requires com.itextpdf.text.pdf.PdfArray in addition to the imports of the original example.
 * @param font the font dictionary
 * @param set the set with the names of the not-embedded fonts
 */
private void processFont(PdfDictionary font, Set<String> set) {
    // Guard against missing entries; Type 3 fonts, for example, carry no BaseFont.
    if (font == null || font.getAsName(PdfName.BASEFONT) == null)
        return;
    String name = font.getAsName(PdfName.BASEFONT).toString();
    if (isEmbeddedSubset(name))
        return;

    PdfDictionary desc = font.getAsDict(PdfName.FONTDESCRIPTOR);
    if (desc == null) {
        // No font descriptor: either a composite font whose descendant
        // fonts carry the descriptor, or a font that is not embedded.
        PdfArray descendant = font.getAsArray(PdfName.DESCENDANTFONTS);
        if (descendant == null) {
            set.add(name.substring(1));
        }
        else {
            for (int i = 0; i < descendant.size(); i++) {
                processFont(descendant.getAsDict(i), set);
            }
        }
    }
    // FontFile (Type 1), FontFile2 (TrueType) or FontFile3 (other types)
    // means the font program is embedded; otherwise report the font.
    else if (desc.get(PdfName.FONTFILE) == null
            && desc.get(PdfName.FONTFILE2) == null
            && desc.get(PdfName.FONTFILE3) == null) {
        set.add(name.substring(1));
    }
}
/**
 * Extracts the names of the not-embedded fonts from page or XObject resources.
 * @param set the set with the font names
 * @param resource the resources dictionary
 */
public void processResource(Set<String> set, PdfDictionary resource) {
    if (resource == null)
        return;
    PdfDictionary xobjects = resource.getAsDict(PdfName.XOBJECT);
    if (xobjects != null) {
        for (PdfName key : xobjects.getKeys()) {
            processResource(set, xobjects.getAsDict(key));
        }
    }
    PdfDictionary fonts = resource.getAsDict(PdfName.FONT);
    if (fonts == null)
        return;
    PdfDictionary font;
    for (PdfName key : fonts.getKeys()) {
        font = fonts.getAsDict(key);                           
        processFont(font, set);
    }
}

The code above retrieves the fonts that are not embedded in the given PDF file. I've improved the code from iText in Action so that it also handles a font's DescendantFonts entry.
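
The subset test in `isEmbeddedSubset` only looks for a '+' at index 7. The PDF specification describes the subset tag as exactly six uppercase letters followed by a plus sign, so a stricter variant can be sketched with a regular expression. The class `SubsetTag` and its methods are hypothetical names used for illustration, and the input is assumed to be the BaseFont name including the leading slash, as in the code above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of iText): parses the subset tag of a BaseFont name. */
final class SubsetTag {

    // "/" + six uppercase letters + "+" + the actual font name
    private static final Pattern SUBSET = Pattern.compile("^/([A-Z]{6})\\+(.+)$");

    private SubsetTag() { }

    /** Returns true if the name carries a well-formed subset tag. */
    static boolean isSubsetName(String baseFontName) {
        return baseFontName != null && SUBSET.matcher(baseFontName).matches();
    }

    /** Returns the font name without the leading slash and without a subset tag, if any. */
    static String stripTag(String baseFontName) {
        if (baseFontName == null)
            return null;
        Matcher m = SUBSET.matcher(baseFontName);
        if (m.matches())
            return m.group(2);
        return baseFontName.startsWith("/") ? baseFontName.substring(1) : baseFontName;
    }
}
```

A subset tag only describes how the font was named. For a strict validation it may still be worth confirming that the font descriptor has a FontFile, FontFile2 or FontFile3 entry instead of returning early on the name alone.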

陈独秀 2024-10-18 05:17:31

When you create a Chunk, you declare which font it uses.
Create a BaseFont from the font you want to use and declare it as BaseFont.EMBEDDED.
Note that unless you set the subset option to true, the whole font will be embedded.

Be aware that embedding a font might infringe the rights of the font's author.

箜明 2024-10-18 05:17:31

I don't think this is an "iText" use case. Use either PDFBox or jPod. These implement the PDF model and as such enable you to:

  • open the document
  • recurse from the document root down the object tree
  • check whether an object is a font object
  • check whether the font file is available

Checking that only embedded fonts are used is far more complex (that is, fonts that are not embedded but also not used are fine).
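
The steps listed in this answer can be sketched independently of any PDF library. The example below is a simplified in-memory model, not PDFBox or jPod API: dictionaries are plain `Map` instances, arrays are `List` instances, and the key names mirror the PDF entries (Type, Subtype, BaseFont, FontDescriptor, FontFile, FontFile2, FontFile3, DescendantFonts). The class `FontTreeWalker` is hypothetical; a real implementation would substitute the library's own dictionary and array types.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch: walks a simplified PDF object model and reports fonts without a font file. */
final class FontTreeWalker {

    private FontTreeWalker() { }

    /** Returns the BaseFont names of all reachable fonts that have no embedded font file. */
    static Set<String> findNonEmbedded(Object root) {
        Set<String> result = new TreeSet<String>();
        // Identity-based set: PDF object graphs contain cycles (for example Parent links).
        Set<Object> visited = Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>());
        walk(root, visited, result);
        return result;
    }

    private static void walk(Object node, Set<Object> visited, Set<String> result) {
        if (node instanceof Map) {
            if (!visited.add(node))
                return; // already seen
            Map<?, ?> dict = (Map<?, ?>) node;
            if (isSimpleFont(dict) && !hasFontFile(dict))
                result.add(String.valueOf(dict.get("BaseFont")));
            for (Object child : dict.values())
                walk(child, visited, result);
        } else if (node instanceof List) {
            if (!visited.add(node))
                return;
            for (Object child : (List<?>) node)
                walk(child, visited, result);
        }
    }

    /** A font dictionary that should carry its own descriptor (not composite, not Type 3). */
    private static boolean isSimpleFont(Map<?, ?> dict) {
        return "Font".equals(dict.get("Type"))
            && !dict.containsKey("DescendantFonts")   // composite font: descendants are checked instead
            && !"Type3".equals(dict.get("Subtype"));  // Type 3 glyphs are defined inside the PDF itself
    }

    private static boolean hasFontFile(Map<?, ?> font) {
        Object desc = font.get("FontDescriptor");
        if (!(desc instanceof Map))
            return false;
        Map<?, ?> d = (Map<?, ?>) desc;
        return d.containsKey("FontFile") || d.containsKey("FontFile2") || d.containsKey("FontFile3");
    }
}
```

As the answer points out, a walk like this reports every font object that is reachable, whether or not any text is actually drawn with it.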

十秒萌定你 2024-10-18 05:17:31

The simplest answer is to open the PDF file with Adobe Acrobat, then:

  1. click on File
  2. select Properties
  3. click on the Fonts tab

This will show you a list of all fonts in the document. Any font that is embedded will display "(Embedded)" next to the font name.

For example:

ACaslonPro-Bold (Embedded)

where ACaslonPro-Bold is derived from the file name that you embedded it with (e.g. FontFactory.register("/path/to/ACaslonPro-Bold.otf",...).
