Custom schema for XMP metadata
I want to write custom metadata to a PDF file. The properties I need are not covered by the standard XMP schemas, so I wrote my own schema containing my own properties. I can successfully write this additional metadata to my PDF file using either the PDFBox or the iTextPDF library. However, I am unable to read the custom metadata back on the client side without parsing the XMP XML.
I guess there should be some API that I am not aware of for mapping a custom schema back to a Java class.
Am I thinking in the right direction, or do I actually need to parse the XML to get my custom data back on the client side?
Here is the code I wrote using the PDFBox library.
Custom metadata file:
package com.ecomail.emx.core.xmp;

import java.io.IOException;
import org.apache.jempbox.xmp.XMPMetadata;

public class EMXMetadata extends XMPMetadata {
    public EMXMetadata() throws IOException {
        super();
    }

    public EMXSchema addEMXSchema() {
        EMXSchema schema = new EMXSchema(this);
        return (EMXSchema) basicAddSchema(schema);
    }

    public EMXSchema getEMXSchema() throws IOException {
        return (EMXSchema) getSchemaByClass(EMXSchema.class);
    }
}
Custom schema file:
package com.ecomail.emx.core.xmp;

import java.util.List;
import org.apache.jempbox.xmp.XMPMetadata;
import org.apache.jempbox.xmp.XMPSchema;
import org.w3c.dom.Element;

public class EMXSchema extends XMPSchema {
    public static final String NAMESPACE = "http://www.test.com/emx/elements/1.1/";

    public EMXSchema(XMPMetadata parent) {
        super(parent, "test", NAMESPACE);
    }

    public EMXSchema(Element element, String prefix) {
        super(element, prefix);
    }

    public String getMetaDataType() {
        return getTextProperty(prefix + ":metaDataType");
    }

    public void setMetaDataType(String metaDataType) {
        setTextProperty(prefix + ":metaDataType", metaDataType);
    }

    public void removeRecipient(String recipient) {
        removeBagValue(prefix + ":recipient", recipient);
    }

    public void addRecipient(String recipient) {
        addBagValue(prefix + ":recipient", recipient);
    }

    public List<String> getRecipients() {
        return getBagList(prefix + ":recipient");
    }
}
XMP client file:
package com.ecomail.emx.core.xmp;

import java.util.GregorianCalendar;
import org.apache.jempbox.xmp.XMPMetadata;
import org.apache.jempbox.xmp.XMPSchemaDublinCore;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDDocumentCatalog;
import org.apache.pdfbox.pdmodel.PDDocumentInformation;
import org.apache.pdfbox.pdmodel.common.PDMetadata;

public class XMPClient {
    private XMPClient() {
    }

    public static void main(String[] args) throws Exception {
        PDDocument document = null;
        try {
            document = PDDocument.load("/home/silver/SVNRoot/ecomail/trunk/sample.pdf");
            PDDocumentCatalog catalog = document.getDocumentCatalog();
            PDDocumentInformation info = document.getDocumentInformation();

            EMXMetadata metadata = new EMXMetadata();
            XMPSchemaDublinCore dcSchema = metadata.addDublinCoreSchema();
            dcSchema.setTitle(info.getTitle());
            dcSchema.addContributor("Contributor");
            dcSchema.setCoverage("coverage");
            dcSchema.addCreator("PDFBox");
            dcSchema.addDate(new GregorianCalendar());
            dcSchema.setDescription("description");
            dcSchema.addLanguage("language");
            dcSchema.setCoverage("coverage");
            dcSchema.setFormat("format");

            EMXSchema emxSchema = metadata.addEMXSchema();
            emxSchema.addRecipient("Recipient 1");
            emxSchema.addRecipient("Recipient 2");

            PDMetadata metadataStream = new PDMetadata(document);
            metadataStream.importXMPMetadata(metadata);
            catalog.setMetadata(metadataStream);
            document.save("/home/silver/SVNRoot/ecomail/trunk/sample1.pdf");
            document.close();

            document = PDDocument.load("/home/silver/SVNRoot/ecomail/trunk/sample1.pdf");
            PDDocumentCatalog catalog2 = document.getDocumentCatalog();
            PDMetadata metadataStream2 = catalog2.getMetadata();
            XMPMetadata metadata2 = metadataStream2.exportXMPMetadata();
            EMXSchema emxSchema2 = (EMXSchema) metadata2.getSchemaByClass(EMXSchema.class);
            System.out.println("recipients : " + emxSchema2.getRecipients());
        } finally {
            if (document != null) {
                document.close();
            }
        }
    }
}
In the XMPClient file, I expect to get the EMXSchema object back from the resulting metadata by querying it by its class:
XMPMetadata metadata2 = metadataStream2.exportXMPMetadata();
EMXSchema emxSchema2 = (EMXSchema) metadata2.getSchemaByClass(EMXSchema.class);
System.out.println("recipients : " + emxSchema2.getRecipients());
But I am getting a NullPointerException, which indicates that the schema was not found (getSchemaByClass returned null).
Can anybody tell me whether I am doing this the right way, or whether I need to parse the XMP to get my recipient values?
Thanks
Comments (1)
I finally got it working myself.
The solution is to use the other constructor of the XMPMetadata class, the one that accepts an existing document object.
Now my custom emxMetadata contains a non-null emxSchema2 object, and I can get my recipients back from it. However, to make it work I had to modify EMXMetadata to register an XML namespace mapping for the custom schema class.
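For comparison, the fallback raised in the question (parsing the XMP XML directly) can be sketched with only the JDK's DOM API. This is a minimal sketch, not the approach the answer above uses. It assumes the recipients are serialized as a standard XMP unordered array (a property element containing rdf:Bag with rdf:li items) in the namespace declared by EMXSchema.NAMESPACE. The class name XmpRecipientParser and the hard-coded SAMPLE packet are hypothetical and exist only for illustration; in a real program the XML would come from the PDF's metadata stream.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: reads the custom "recipient" bag straight from XMP XML.
public class XmpRecipientParser {
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
    // Same namespace URI as EMXSchema.NAMESPACE in the question.
    static final String EMX_NS = "http://www.test.com/emx/elements/1.1/";

    // Illustrative packet (assumed shape of the serialized custom schema).
    static final String SAMPLE =
        "<x:xmpmeta xmlns:x=\"adobe:ns:meta/\">"
        + "<rdf:RDF xmlns:rdf=\"" + RDF_NS + "\">"
        + "<rdf:Description rdf:about=\"\" xmlns:test=\"" + EMX_NS + "\">"
        + "<test:recipient><rdf:Bag>"
        + "<rdf:li>Recipient 1</rdf:li>"
        + "<rdf:li>Recipient 2</rdf:li>"
        + "</rdf:Bag></test:recipient>"
        + "</rdf:Description>"
        + "</rdf:RDF>"
        + "</x:xmpmeta>";

    static List<String> readRecipients(String xmp) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Look elements up by namespace URI rather than by prefix,
        // because the prefix ("test" here) is not guaranteed to be stable.
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(
            new ByteArrayInputStream(xmp.getBytes(StandardCharsets.UTF_8)));

        List<String> recipients = new ArrayList<>();
        NodeList props = doc.getElementsByTagNameNS(EMX_NS, "recipient");
        for (int i = 0; i < props.getLength(); i++) {
            // Collect every rdf:li item nested under the property element.
            NodeList items =
                ((Element) props.item(i)).getElementsByTagNameNS(RDF_NS, "li");
            for (int j = 0; j < items.getLength(); j++) {
                recipients.add(items.item(j).getTextContent().trim());
            }
        }
        return recipients;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("recipients : " + readRecipients(SAMPLE));
    }
}
```

Matching on the namespace URI instead of the "test" prefix keeps the lookup working even if another tool rewrites the prefix when it saves the file.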