Java says my XML document is not well-formed

Java's XML parser seems to think that my XML document is not well-formed after the root element. But I've validated it with several tools, and they all disagree. It's probably an error in my code rather than in the document itself. I'd really appreciate any help you could offer.

Here is my Java method:

private void loadFromXMLFile(File f) throws ParserConfigurationException, IOException, SAXException {
    File file = f;
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    DocumentBuilder db;
    Document doc = null;
    db = dbf.newDocumentBuilder();
    doc = db.parse(file);
    doc.getDocumentElement().normalize();
    String desc = "";
    String due = "";
    String comment = "";
    NodeList tasksList = doc.getElementsByTagName("task");
    for (int i = 0; i < tasksList.getLength(); i++) {
        NodeList attributes = tasksList.item(i).getChildNodes();
        for (int j = 0; i < attributes.getLength(); j++) {
        Node attribute = attributes.item(i);
        if (attribute.getNodeName() == "description") {
            desc = attribute.getTextContent();
        }
        if (attribute.getNodeName() == "due") {
            due = attribute.getTextContent();
        }
        if (attribute.getNodeName() == "comment") {
            comment = attribute.getTextContent();
        }
        tasks.add(new Task(desc, due, comment));
        }
        desc = "";
        due = "";
        comment = "";
    }
}

The following is the XML file I'm trying to load:

<?xml version="1.0"?>  
<tasklist>  
    <task>  
        <description>Task 1</description>  
        <due>Due date 1</due>  
        <comment>Comment 1</comment>  
        <completed>false</completed>  
    </task>  
    <task>  
        <description>Task 2</description>  
        <due>Due date 2</due>  
        <comment>Comment 2</comment>  
        <completed>false</completed>  
    </task>  
    <task>  
        <description>Task 3</description>  
        <due>Due date 3</due>  
        <comment>Comment 3</comment>  
        <completed>true</completed>  
    </task>  
</tasklist>

And here is the error message Java is throwing:

run:
[Fatal Error] tasks.xml:28:3: The markup in the document following the root element must be well-formed.
May 17, 2010 6:07:02 PM todolist.TodoListGUI <init>
SEVERE: null
org.xml.sax.SAXParseException: The markup in the document following the root element must be well-formed.
        at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:239)
        at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:283)
        at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:208)
        at todolist.TodoListGUI.loadFromXMLFile(TodoListGUI.java:199)
        at todolist.TodoListGUI.<init>(TodoListGUI.java:42)
        at todolist.Main.main(Main.java:25)
BUILD SUCCESSFUL (total time: 19 seconds)

For reference, TodoListGUI.java:199 is:

doc = db.parse(file);

If context is helpful to anyone here, I'm trying to write a simple GUI application to manage a todo list that can read and write to and from XML files defining the tasks.

Comments (6)

桜花祭 2024-09-08 17:27:58

org.xml.sax.SAXParseException: The markup in the document following the root element must be well-formed.

This particular exception indicates that there is more than one root element in the XML document. In other words, the <tasklist> is not the only root element. To take your XML document as an example, think of one without the <tasklist> element and with three <task> elements in the root. This would cause this kind of exception.
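
To see this behaviour in isolation, here is a minimal sketch that feeds two small strings to the same JAXP parser: one with a single root and one with a second element after the root. The class name RootElementDemo and the helper parseError are made up for this illustration, and the sample strings are not the asker's file.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical demo class, not part of the asker's project.
final class RootElementDemo {

    /** Returns null if the XML parses, otherwise "line:column: message" from the parser. */
    static String parseError(String xml) {
        try {
            DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            db.parse(new InputSource(new StringReader(xml)));
            return null;
        } catch (SAXParseException e) {
            // The parser reports where it gave up, like the 28:3 in the question.
            return e.getLineNumber() + ":" + e.getColumnNumber() + ": " + e.getMessage();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String oneRoot = "<tasklist><task/></tasklist>";
        String twoRoots = "<tasklist><task/></tasklist><tasklist/>";
        System.out.println("one root:  " + parseError(oneRoot));
        System.out.println("two roots: " + parseError(twoRoots));
    }
}
```

The reported line and column point at the place where the parser found the extra markup, so the 28:3 in the question suggests the file on disk has more lines than the 21-line listing that was posted.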

Since the XML file you posted looks fine, the problem lies somewhere else. It looks like the parser is not reading the XML file you expect it to read. For quick debugging, add the following to the top of your method:

System.out.println(f.getAbsolutePath());

Locate the file in the disk file system and verify it.
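
If opening the file by hand is inconvenient, a slightly longer check along the same lines could print the path, the size, and the raw contents just before parsing. This is only a sketch: FileCheck and describe are hypothetical names, and it assumes the file is encoded as UTF-8.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical helper for debugging, not part of the asker's project.
final class FileCheck {

    /** Describes the file the program is really about to parse. */
    static String describe(File f) {
        StringBuilder sb = new StringBuilder();
        sb.append("path=").append(f.getAbsolutePath()).append('\n');
        sb.append("isFile=").append(f.isFile()).append('\n');
        if (f.isFile()) {
            try {
                byte[] bytes = Files.readAllBytes(f.toPath());
                sb.append("bytes=").append(bytes.length).append('\n');
                // Assumption: the file is UTF-8; other encodings may show up garbled.
                sb.append(new String(bytes, StandardCharsets.UTF_8));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        File f = new File(args.length > 0 ? args[0] : "tasks.xml");
        System.out.println(describe(f));
    }
}
```

Calling describe(f) at the top of loadFromXMLFile and comparing the output with the posted XML would show whether the two really match.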

小草泠泠 2024-09-08 17:27:58

I think there may be something wrong with the actual file. When I copy your code but use the XML as a string input to the parser it works fine (after fixing a couple of issues - attributes.item(i) should be attributes.item(j) and you need to break out of the loop when attribute == null).

In trying to reproduce your error, I can get the same message if I add another <tasklist></tasklist> element. This is because the XML no longer has a single root element (tasklist). Is this the problem you are seeing? Does the XML in tasks.xml have a single root element?
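
The loop fixes mentioned above can be sketched roughly as follows, parsing the XML from a string as described. Several things here are assumptions rather than the asker's code: the Task class is a hypothetical stand-in with the same three-argument constructor, the inner loop is bounded by j so no null check is needed, names are compared with equals() instead of ==, and tasks.add(...) is moved outside the inner loop so that each task element yields one Task.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical class name; the original method lives in TodoListGUI.
final class TaskLoader {

    /** Hypothetical stand-in for the question's Task class. */
    static final class Task {
        final String desc;
        final String due;
        final String comment;

        Task(String desc, String due, String comment) {
            this.desc = desc;
            this.due = due;
            this.comment = comment;
        }
    }

    static List<Task> loadTasks(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            doc.getDocumentElement().normalize();

            List<Task> tasks = new ArrayList<>();
            NodeList taskNodes = doc.getElementsByTagName("task");
            for (int i = 0; i < taskNodes.getLength(); i++) {
                String desc = "";
                String due = "";
                String comment = "";
                NodeList children = taskNodes.item(i).getChildNodes();
                // Index with j, not i; whitespace text nodes are simply skipped.
                for (int j = 0; j < children.getLength(); j++) {
                    Node child = children.item(j);
                    String name = child.getNodeName();
                    if ("description".equals(name)) {
                        desc = child.getTextContent();
                    } else if ("due".equals(name)) {
                        due = child.getTextContent();
                    } else if ("comment".equals(name)) {
                        comment = child.getTextContent();
                    }
                }
                // One Task per <task> element, added after all children are read.
                tasks.add(new Task(desc, due, comment));
            }
            return tasks;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<tasklist><task><description>Task 1</description>"
                + "<due>Due date 1</due><comment>Comment 1</comment></task></tasklist>";
        for (Task t : loadTasks(xml)) {
            System.out.println(t.desc + " / " + t.due + " / " + t.comment);
        }
    }
}
```

None of these changes affect the SAXParseException, which is thrown earlier by db.parse(file); they only matter once the file itself parses.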

巷雨优美回忆 2024-09-08 17:27:58

Try changing your XML declaration to:

<?xml version="1.0" encoding="UTF-8" ?>

烦人精 2024-09-08 17:27:58

For what it's worth, the Scala REPL successfully parsed your markup.

scala> val tree = <tasklist>
 | <task>
 | <description>Task 1</description>
 | <due>Due date 1</due>
 | <comment>Comment 1</comment>
 | <completed>false</completed>
 | </task>
 | <task>
 | <description>Task 2</description>
 | <due>Due date 2</due>
 | <comment>Comment 2</comment>
 | <completed>false</completed>
 | </task>
 | <task>
 | <description>Task 3</description>
 | <due>Due date 3</due>
 | <comment>Comment 3</comment>
 | <completed>true</completed>
 | </task>
 | </tasklist>
tree: scala.xml.Elem = 
<tasklist>
<task>
<description>Task 1</description>
<due>Due date 1</due>
<comment>Comment 1</comment>
<completed>false</completed>
</task>
<task>
<description>Task 2</description>
<due>Due date 2</due>
<comment>Comment 2</comment>
<completed>false</completed>
</task>
<task>
<description>Task 3</description>
<due>Due date 3</due>
<comment>Comment 3</comment>
<completed>true</completed>
</task>
</tasklist>

恍梦境° 2024-09-08 17:27:58

Also for what it's worth, here is what I got when I saved your XML into a file called test.xml and ran it through xmllint:

[jhr@Macintosh] [~]
xmllint test.xml
<?xml version="1.0"?>
<tasklist>  
    <task>  
        <description>Task 1</description>  
        <due>Due date 1</due>  
        <comment>Comment 1</comment>  
        <completed>false</completed>  
    </task>  
    <task>  
        <description>Task 2</description>  
        <due>Due date 2</due>  
        <comment>Comment 2</comment>  
        <completed>false</completed>  
    </task>  
    <task>  
        <description>Task 3</description>  
        <due>Due date 3</due>  
        <comment>Comment 3</comment>  
        <completed>true</completed>  
    </task>  
</tasklist>

It seems to be fine. Most likely you have some stray characters in your actual file that you can't see. Try viewing the actual file in an editor that shows non-printable characters. As someone else suggested, if this isn't an English UTF-8 machine, the file might contain Unicode characters that you can't see but the parser can. Either that, or you aren't loading the file you think you are. Step through with a debugger and check the actual contents of the file before it gets fed into the parser.

雅心素梦 2024-09-08 17:27:58

Are you sure that's everything in that file? The error is complaining that there is more markup after the current root, so there must be something else after </tasklist>.

Sometimes, this error may be caused by non-printable characters. If you don't see anything, do a hexdump of the file.
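
If no hexdump tool is at hand, a rough equivalent can be written in a few lines of Java. The sketch below is an illustration only: HexDump and hexDump are made-up names, the output layout merely imitates common hexdump tools, and the default file name tasks.xml is taken from the error message in the question.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper for debugging, not part of the asker's project.
final class HexDump {

    /** Formats bytes as offset, 16 hex values per row, and a printable-ASCII column. */
    static String hexDump(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (int offset = 0; offset < data.length; offset += 16) {
            sb.append(String.format("%08x ", offset));
            StringBuilder text = new StringBuilder();
            for (int k = 0; k < 16; k++) {
                if (offset + k < data.length) {
                    int b = data[offset + k] & 0xff;
                    sb.append(String.format(" %02x", b));
                    // Anything outside printable ASCII is shown as a dot.
                    text.append(b >= 0x20 && b < 0x7f ? (char) b : '.');
                } else {
                    sb.append("   ");
                }
            }
            sb.append("  |").append(text).append("|\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = args.length > 0
                ? Files.readAllBytes(Paths.get(args[0]))
                : "</tasklist>\n\u0000<x>".getBytes(StandardCharsets.UTF_8);
        System.out.print(hexDump(data));
    }
}
```

Running it with the path to tasks.xml as the argument and reading the rows after the bytes of </tasklist> should reveal anything that follows the root element, visible or not.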
