XML Java

Java XML Creation

Posted on 2024-11-15 08:45:04

I'm trying to get an attribute (fileID) from my XML document to use as the filename for my XML split. The split works; I just need to extract the fileID to use as the name.

[EDITED] I can read the attribute now, but the last XML file is not created. In my example, the first 2 files are created with the correct names, but the file for the last fileID, "000154OP.XML", is not. Can anyone help?

This is my xml document

<root>
 <envelope fileID="000152OP.XML">
   <record id="850">
   </record>
</envelope>
<envelope fileID="000153OP.XML">
  <record id="850">
  </record>
</envelope>
<envelope fileID="000154OP.XML">
  <record id="850">
  </record>
</envelope>
</root>

And here's my Java code

    public static void splitXMLFile (String file) throws Exception {         
    String[] temp;
    String[] temp2;
    String[] temp3;
    String[] temp4;
    String[] temp5;
    String[] temp6;
    File input = new File(file);         
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();         
    Document doc = dbf.newDocumentBuilder().parse(input);
    XPath xpath = XPathFactory.newInstance().newXPath();          
    NodeList nodes = (NodeList) xpath.evaluate("//root/envelope", doc, XPathConstants.NODESET);          
    int itemsPerFile = 1;         

    Node staff = doc.getElementsByTagName("envelope").item(0);

    NamedNodeMap attr = staff.getAttributes();
    Node nodeAttr = attr.getNamedItem("fileID");
    String node = nodeAttr.toString();
    temp = node.split("=");
    temp2 = temp[1].split("^\"");
    temp3 = temp2[1].split("\\.");

    Document currentDoc = dbf.newDocumentBuilder().newDocument();         
    Node rootNode = currentDoc.createElement("root");   
    File currentFile = new File("C:\\XMLFiles\\" + temp3[0]+ ".xml"); 

    for (int i=1; i <= nodes.getLength(); i++) {             
        Node imported = currentDoc.importNode(nodes.item(i-1), true);             
        rootNode.appendChild(imported); 

        Node staff2 = doc.getElementsByTagName("envelope").item(i);
        NamedNodeMap attr2 = staff2.getAttributes();
        Node nodeAttr2 = attr2.getNamedItem("fileID");
        String node2 = nodeAttr2.toString();
        temp4 = node2.split("=");
        temp5 = temp4[1].split("^\"");
        temp6 = temp5[1].split("\\.");

        if (i % itemsPerFile == 0) { 

            writeToFile(rootNode, currentFile);                  
            rootNode = currentDoc.createElement("root");    
            currentFile = new File("C:\\XMLFiles\\" + temp6[0]+".xml");


        }         
    }          
    writeToFile(rootNode, currentFile);     
}    

 private static void writeToFile(Node node, File file) throws Exception {         
     Transformer transformer = TransformerFactory.newInstance().newTransformer();         
     transformer.transform(new DOMSource(node), new StreamResult(new FileWriter(file)));     
 }

别靠近我心 2024-11-22 08:45:04

There is a lot of duplication in your code, but I have a solution that removes most of it. I know there are less complex solutions (for example, I don't think the if (i % itemsPerFile == 0) logic is required), but I do not know all of your requirements, so I have left it in.

The main problems were that you overwrote the last file with the wrong data and that your looping logic was duplicated. A good rule of thumb I go by: whenever I think I might have to duplicate code, something is wrong. Your logic treated the first <envelope> separately from the remaining <envelope> elements, whereas they should be treated as a group of 3. The logic then only needs to apply the same searching, splitting, matching, importing, etc. to each element in turn.

What complicated matters is that your input XML file had the same <record id="850"> for each <envelope>. I changed mine to 850, 851 and 852. Running your original code produced 3 files, 000152OP.xml, 000153OP.xml and 000154OP.xml, but the first one contained the 851 record, so I immediately knew the looping logic was incorrect.

A simpler solution is detailed below. Given your input XML file as the argument, it produces 3 output files in the same directory (I removed the C:\ hard-coding for simplicity), each with the correct <record> element.

import java.io.*;
import org.w3c.dom.*;
import javax.xml.parsers.*;
import javax.xml.xpath.*;
import javax.xml.transform.*;
import javax.xml.transform.dom.*;
import javax.xml.transform.stream.*;

public class SplitXML {
    public static void main(String[] args) throws Exception {
        File input = new File(args[0]);
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        Document doc = dbf.newDocumentBuilder().parse(input);
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate("//root/envelope", doc, XPathConstants.NODESET);
        int itemsPerFile = 1;

        Document currentDoc = dbf.newDocumentBuilder().newDocument();

        for (int i=0; i < nodes.getLength(); i++) {
            Node rootNode = currentDoc.createElement("root");

            Node imported = currentDoc.importNode(nodes.item(i), true);
            rootNode.appendChild(imported);

            Node staff = doc.getElementsByTagName("envelope").item(i);
            NamedNodeMap attr = staff.getAttributes();
            Node nodeAttr = attr.getNamedItem("fileID");
            String filename = nodeAttr.getNodeValue();
            String[] fileParts = filename.split("\\.");

            if (i % itemsPerFile == 0) {
                File currentFile = new File(fileParts[0] + "." + fileParts[1].toLowerCase());
                writeToFile(rootNode, currentFile);
            }
        }
    }

    private static void writeToFile(Node node, File file) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        // Close the writer after each file so the handle is released promptly.
        try (FileWriter writer = new FileWriter(file)) {
            transformer.transform(new DOMSource(node), new StreamResult(writer));
        }
    }
}

You should read up on Node and String.split, as your code did extra work where a built-in method already exists (for example, Node.getNodeValue()).
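To illustrate the point about built-in methods, here is a minimal sketch that reads fileID directly instead of parsing the result of toString(). The class name FileIdDemo and the helpers readFileIds and outputName are made up for this example; Element.getAttribute is the standard DOM call, and casting to Element assumes every node returned by getElementsByTagName is an element, which holds for this lookup.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class FileIdDemo {
    // Hypothetical helper: keeps the base name and lower-cases the extension.
    public static String outputName(String fileId) {
        int dot = fileId.lastIndexOf('.');
        if (dot < 0) {
            return fileId; // no extension to normalise
        }
        return fileId.substring(0, dot) + fileId.substring(dot).toLowerCase();
    }

    // Hypothetical helper: returns the fileID of every <envelope> in document order.
    public static String[] readFileIds(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        NodeList envelopes = doc.getElementsByTagName("envelope");
        String[] ids = new String[envelopes.getLength()];
        for (int i = 0; i < envelopes.getLength(); i++) {
            // getAttribute returns the raw value, so no split("=") or quote stripping is needed.
            ids[i] = ((Element) envelopes.item(i)).getAttribute("fileID");
        }
        return ids;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<root><envelope fileID=\"000152OP.XML\"/>"
                   + "<envelope fileID=\"000153OP.XML\"/></root>";
        for (String id : readFileIds(xml)) {
            System.out.println(id + " -> " + outputName(id));
        }
    }
}
```

Using lastIndexOf instead of split("\\.") is a design choice for the sketch: it avoids an ArrayIndexOutOfBoundsException when a fileID has no extension.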

Edit: The source for creating 1000 <envelope> elements that I used to test the above code:

import java.io.*;

public class CreateXML {
    public static void main(String[] args) throws Exception {
        FileWriter fstream = new FileWriter(new File("split.xml"));
        BufferedWriter out = new BufferedWriter(fstream);
        out.write("<root>");
        for (int i = 0; i < 1000; i++) {
            out.write("<envelope fileID=\"000" + i +"P.XML\"><record id=\"" + i + "\"></record></envelope>\n");
        }
        out.write("</root>");
        out.close();
    }
}

I ran java CreateXML to create the input file split.xml and then java SplitXML split.xml to create the 1000 files.

段念尘 2024-11-22 08:45:04

Try

 for (int i=0; i < nodes.getLength(); i++) {}

instead of

 for (int i=1; i <= nodes.getLength(); i++) {}
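As far as I can tell, the loop bounds matter because the question's loop calls item(i) while i runs up to getLength(), and NodeList.item returns null for any index at or past the end. On the last iteration the following getAttributes() call would then throw a NullPointerException before the final file is written. The sketch below only demonstrates that boundary behaviour; the class name LoopBoundsDemo and its helpers are hypothetical. Note that after switching to a 0-based loop, the question's nodes.item(i-1) would also need to become nodes.item(i).

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class LoopBoundsDemo {
    // Hypothetical helper: parses the XML string and returns its <envelope> nodes.
    private static NodeList envelopes(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getElementsByTagName("envelope");
    }

    // Number of <envelope> elements in the document.
    public static int count(String xml) throws Exception {
        return envelopes(xml).getLength();
    }

    // True if an <envelope> exists at the given index; item() yields null otherwise.
    public static boolean hasItem(String xml, int index) throws Exception {
        return envelopes(xml).item(index) != null;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<root><envelope fileID=\"a.xml\"/><envelope fileID=\"b.xml\"/></root>";
        int n = count(xml);
        // Valid indices are 0 .. n-1; index n is already out of range.
        System.out.println("last valid index has a node: " + hasItem(xml, n - 1));
        System.out.println("index == length has a node: " + hasItem(xml, n));
    }
}
```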

分分钟 2024-11-22 08:45:04

This is a modified version of writeToFile(Node node, File file) that closes the output stream. Without closing the output stream, file operations such as deleting or moving the file are difficult to handle.

private static void writeToFile(Node node, File file) {
    // try-with-resources closes the stream even if the transform fails,
    // and avoids a NullPointerException in cleanup when setup throws early.
    try (OutputStream out = new FileOutputStream(file, false)) {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.transform(new DOMSource(node), new StreamResult(out));
    } catch (TransformerFactoryConfigurationError | TransformerException | IOException e) {
        e.printStackTrace();
    }
}
