Converting XML to JSON format

Posted on 2024-10-19 16:33:16

I have to convert a docx file (which is in Open XML format) into JSON. I need some guidelines for doing this. Thanks in advance.

Comments (9)

蓝戈者 2024-10-26 16:33:16

You may take a look at the Json-lib Java library, which provides XML-to-JSON conversion.

String xml = "<hello><test>1.2</test><test2>123</test2></hello>";
XMLSerializer xmlSerializer = new XMLSerializer();  
JSON json = xmlSerializer.read( xml );  

If you need the root tag too, simply add an outer dummy tag:

String xml = "<hello><test>1.2</test><test2>123</test2></hello>";
XMLSerializer xmlSerializer = new XMLSerializer();  
JSON json = xmlSerializer.read("<x>" + xml + "</x>");  

雨落星ぅ辰 2024-10-26 16:33:16

There is no direct mapping between XML and JSON; XML carries with it type information (each element has a name) as well as namespacing. Therefore, unless each JSON object has type information embedded, the conversion is going to be lossy.

But that doesn't necessarily matter. What does matter is that the consumer of the JSON knows the data contract. For example, given this XML:

<books>
  <book author="Jimbo Jones" title="Bar Baz">
    <summary>Foo</summary>
  </book>
  <book title="Don't Care" author="Fake Person">
    <summary>Dummy Data</summary>
  </book>
</books>

You could convert it to this:

{
    "books": [
        { "author": "Jimbo Jones", "title": "Bar Baz", "summary": "Foo" },
        { "author": "Fake Person", "title": "Don't Care", "summary": "Dummy Data" }
    ]
}

And the consumer wouldn't need to know that each object in the books collection was a book object.
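
For what it's worth, here is a minimal sketch of that hand-written mapping, using only the DOM parser that ships with the JDK. The class name BooksToJson and its helper methods are made up for illustration, and the data contract (a "books" array of objects with author, title and summary) is hard-coded, which is exactly the lossy, contract-driven approach described above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: maps the <books> XML above onto the agreed JSON contract.
final class BooksToJson {

    // Wraps a value in double quotes and escapes quotes, backslashes and control characters.
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '"' || c == '\\') sb.append('\\').append(c);
            else if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
            else sb.append(c);
        }
        return sb.append('"').toString();
    }

    // Parses the XML and emits {"books":[{...},{...}]} without recording the element name "book".
    static String convert(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            NodeList books = doc.getElementsByTagName("book");
            StringBuilder json = new StringBuilder("{\"books\":[");
            for (int i = 0; i < books.getLength(); i++) {
                Element book = (Element) books.item(i);
                NodeList summaries = book.getElementsByTagName("summary");
                String summary = summaries.getLength() > 0
                        ? summaries.item(0).getTextContent().trim()
                        : "";
                if (i > 0) json.append(',');
                // Attributes and child elements are flattened into plain JSON members.
                json.append("{\"author\":").append(quote(book.getAttribute("author")))
                    .append(",\"title\":").append(quote(book.getAttribute("title")))
                    .append(",\"summary\":").append(quote(summary))
                    .append('}');
            }
            return json.append("]}").toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<books>"
                + "<book author=\"Jimbo Jones\" title=\"Bar Baz\"><summary>Foo</summary></book>"
                + "<book title=\"Don't Care\" author=\"Fake Person\"><summary>Dummy Data</summary></book>"
                + "</books>";
        System.out.println(convert(xml));
    }
}
```

In real code you would probably hand the string building to a JSON library; it is done by hand here only to keep the sketch free of dependencies.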

Edit:

If you have an XML Schema for the XML and are using .NET, you can generate classes from the schema using xsd.exe. Then, you could parse the source XML into objects of these classes, then use a DataContractJsonSerializer to serialize the classes as JSON.

If you don't have a schema, it will be hard to avoid defining your JSON format manually.

七度光 2024-10-26 16:33:16

The XML class in the org.json namespace provides you with this functionality.

You have to call the static toJSONObject method. Its documentation describes it as follows:

Converts a well-formed (but not necessarily valid) XML string into a JSONObject. Some information may be lost in this transformation because JSON is a data format and XML is a document format. XML uses elements, attributes, and content text, while JSON uses unordered collections of name/value pairs and arrays of values. JSON does not like to distinguish between elements and attributes. Sequences of similar elements are represented as JSONArrays. Content text may be placed in a "content" member. Comments, prologs, DTDs, and <[ [ ]]> are ignored.

梦里泪两行 2024-10-26 16:33:16

If you are dissatisfied with the various implementations, try rolling your own. Here is some code I wrote this afternoon to get you started. It works with net.sf.json and Apache commons-lang:

static public JSONObject readToJSON(InputStream stream) throws Exception {
    SAXParserFactory factory = SAXParserFactory.newInstance();
    factory.setNamespaceAware(true);
    SAXParser parser = factory.newSAXParser();
    SAXJsonParser handler = new SAXJsonParser();
    parser.parse(stream, handler);
    return handler.getJson();
}

And the SAXJsonParser implementation:

package xml2json;

import net.sf.json.*;
import org.apache.commons.lang.StringUtils;
import org.xml.sax.*;
import org.xml.sax.helpers.DefaultHandler;
import java.util.ArrayList;
import java.util.List;

public class SAXJsonParser extends DefaultHandler {

    static final String TEXTKEY = "_text";

    JSONObject result;
    List<JSONObject> stack;

    public SAXJsonParser(){}
    public JSONObject getJson(){return result;}
    public String attributeName(String name){return "@"+name;}

    public void startDocument () throws SAXException {
        stack = new ArrayList<JSONObject>();
        stack.add(0,new JSONObject());
    }
    public void endDocument () throws SAXException {result = stack.remove(0);}
    public void startElement (String uri, String localName,String qName, Attributes attributes) throws SAXException {
        JSONObject work = new JSONObject();
        for (int ix=0;ix<attributes.getLength();ix++)
            work.put( attributeName( attributes.getLocalName(ix) ), attributes.getValue(ix) );
        stack.add(0,work);
    }
    public void endElement (String uri, String localName, String qName) throws SAXException {
        JSONObject pop = stack.remove(0);       // examine stack
        Object stashable = pop;
        if (pop.containsKey(TEXTKEY)) {
            String value = pop.getString(TEXTKEY).trim();
            if (pop.keySet().size()==1) stashable = value; // single value
            else if (StringUtils.isBlank(value)) pop.remove(TEXTKEY);
        }
        JSONObject parent = stack.get(0);
        if (!parent.containsKey(localName)) {   // add new object
            parent.put( localName, stashable );
        }
        else {                                  // aggregate into arrays
            Object work = parent.get(localName);
            if (work instanceof JSONArray) {
                ((JSONArray)work).add(stashable);
            }
            else {
                parent.put(localName,new JSONArray());
                parent.getJSONArray(localName).add(work);
                parent.getJSONArray(localName).add(stashable);
            }
        }
    }
    public void characters (char ch[], int start, int length) throws SAXException {
        JSONObject work = stack.get(0);            // aggregate characters
        String value = (work.containsKey(TEXTKEY) ? work.getString(TEXTKEY) : "" );
        work.put(TEXTKEY, value+new String(ch,start,length) );
    }
    public void warning (SAXParseException e) throws SAXException {
        System.out.println("warning  e=" + e.getMessage());
    }
    public void error (SAXParseException e) throws SAXException {
        System.err.println("error  e=" + e.getMessage());
    }
    public void fatalError (SAXParseException e) throws SAXException {
        System.err.println("fatalError  e=" + e.getMessage());
        throw e;
    }
}

焚却相思 2024-10-26 16:33:16

Converting complete docx files into JSON does not look like a good idea, because docx is a document-centric XML format and JSON is a data-centric format. XML in general is designed to be both document-centric and data-centric. Though it is technically possible to convert document-centric XML into JSON, handling the generated data might be overly complex. Try to focus on the data you actually need and convert only that part.
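
As a rough illustration of converting only the part you need, the sketch below pulls just the paragraph text out of the main document part and emits it as a JSON array of strings. It assumes you have already read word/document.xml out of the .docx archive (a docx is a ZIP file, so java.util.zip can do that) and passes it in as a string. The class and method names are made up for the example, and it uses nothing beyond the JDK.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: extracts paragraph text from WordprocessingML and drops everything else.
final class DocxParagraphs {

    // Namespace of the w: prefix used in word/document.xml.
    static final String W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Returns one string per <w:p>, built by concatenating the text of its <w:t> descendants.
    // Limitation: paragraphs nested inside other paragraphs (e.g. text boxes) are counted twice.
    static List<String> paragraphs(String documentXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(documentXml)));
            NodeList paras = doc.getElementsByTagNameNS(W_NS, "p");
            List<String> result = new ArrayList<>();
            for (int i = 0; i < paras.getLength(); i++) {
                NodeList texts = ((Element) paras.item(i)).getElementsByTagNameNS(W_NS, "t");
                StringBuilder text = new StringBuilder();
                for (int j = 0; j < texts.getLength(); j++) {
                    text.append(texts.item(j).getTextContent());
                }
                result.add(text.toString());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse document XML", e);
        }
    }

    // Serializes the strings as a JSON array, escaping quotes, backslashes and control characters.
    static String toJsonArray(List<String> items) {
        StringBuilder json = new StringBuilder("[");
        for (int i = 0; i < items.size(); i++) {
            if (i > 0) json.append(',');
            json.append('"');
            for (char c : items.get(i).toCharArray()) {
                if (c == '"' || c == '\\') json.append('\\').append(c);
                else if (c < 0x20) json.append(String.format("\\u%04x", (int) c));
                else json.append(c);
            }
            json.append('"');
        }
        return json.append(']').toString();
    }

    public static void main(String[] args) {
        String xml = "<w:document xmlns:w=\"" + W_NS + "\"><w:body>"
                + "<w:p><w:r><w:t>Hello </w:t></w:r><w:r><w:t>world</w:t></w:r></w:p>"
                + "</w:body></w:document>";
        System.out.println(toJsonArray(paragraphs(xml)));
    }
}
```

Formatting, styles, tables and images are all deliberately thrown away here; if you need any of them, extend the extraction for those specific elements rather than converting the whole tree.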

仲春光 2024-10-26 16:33:16

If you need to be able to manipulate your XML before it gets converted to JSON, or want fine-grained control of your representation, go with XStream. It's really easy to convert between: xml-to-object, json-to-object, object-to-xml, and object-to-json. Here's an example from XStream's docs:

XML

<person>
  <firstname>Joe</firstname>
  <lastname>Walnes</lastname>
  <phone>
    <code>123</code>
    <number>1234-456</number>
  </phone>
  <fax>
    <code>123</code>
    <number>9999-999</number>
  </fax>
</person>

POJO (DTO)

public class Person {
    private String firstname;
    private String lastname;
    private PhoneNumber phone;
    private PhoneNumber fax;
    // ... constructors and methods
}

Convert from XML to POJO:

String xml = "<person>...</person>";
XStream xstream = new XStream();
Person person = (Person)xstream.fromXML(xml);

And then from POJO to JSON:

XStream xstream = new XStream(new JettisonMappedXmlDriver());
String json = xstream.toXML(person);

Note: although the method is named toXML(), XStream will produce JSON, since the Jettison driver is used.

冰之心 2024-10-26 16:33:16

If you have a valid DTD file for the XML snippet, then you can easily convert XML to JSON and JSON to XML using the open source EclipseLink jar. A detailed sample Java project can be found here: http://www.cubicrace.com/2015/06/How-to-convert-XML-to-JSON-format.html

因为看清所以看轻 2024-10-26 16:33:16

Use

xmlSerializer.setForceTopLevelObject(true)

to include the root element in the resulting JSON.

Your code would look like this:

String xml = "<hello><test>1.2</test><test2>123</test2></hello>";
XMLSerializer xmlSerializer = new XMLSerializer();
xmlSerializer.setForceTopLevelObject(true);
JSON json = xmlSerializer.read(xml);

旧人九事 2024-10-26 16:33:16

Docx4j

I've used docx4j before, and it's worth taking a look at.

unXml

You could also check out my open source unXml-library that is available on Maven Central.

It is lightweight, and has a simple syntax to pick out XPaths from your xml, and get them returned as Json attributes in a Jackson ObjectNode.
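
To show the general idea of picking values by XPath and returning them as JSON members, here is a small sketch that uses only the JDK's javax.xml.xpath. To be clear, this is not unXml's API and it does not produce a Jackson ObjectNode. The class XPathPick and its methods are made up for illustration, and the result is a plain JSON string.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper: evaluates one XPath per JSON field and collects the string results.
final class XPathPick {

    // Maps each field name to the string value of its XPath (empty string if nothing matches).
    static Map<String, String> pick(String xml, Map<String, String> fieldToXPath) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            Map<String, String> out = new LinkedHashMap<>();
            for (Map.Entry<String, String> e : fieldToXPath.entrySet()) {
                out.put(e.getKey(), xpath.evaluate(e.getValue(), doc));
            }
            return out;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not evaluate XPath", e);
        }
    }

    // Escapes quotes, backslashes and control characters and wraps the value in quotes.
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '"' || c == '\\') sb.append('\\').append(c);
            else if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
            else sb.append(c);
        }
        return sb.append('"').toString();
    }

    // Serializes the picked values as a flat JSON object, keeping insertion order.
    static String toJson(Map<String, String> fields) {
        StringBuilder json = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, String> e : fields.entrySet()) {
            if (!first) json.append(',');
            first = false;
            json.append(quote(e.getKey())).append(':').append(quote(e.getValue()));
        }
        return json.append('}').toString();
    }

    public static void main(String[] args) {
        String xml = "<person><firstname>Joe</firstname>"
                + "<phone><code>123</code><number>1234-456</number></phone></person>";
        Map<String, String> query = new LinkedHashMap<>();
        query.put("firstName", "/person/firstname");
        query.put("phone", "/person/phone/number");
        System.out.println(toJson(pick(xml, query)));
    }
}
```

For namespaced XML such as docx you would also need to register a NamespaceContext on the XPath object, which is left out here.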
