Parsing an XML file without a root element in Java
I have this XML file which doesn't have a root node. Other than manually adding a "fake" root element, is there any way I would be able to parse an XML file in Java? Thanks.
Comments (6)
I suppose you could create a new implementation of InputStream that wraps the one you'll be parsing from. This implementation would return the bytes of the opening root tag before the bytes from the wrapped stream and the bytes of the closing root tag afterwards. That would be fairly simple to do.
I may be faced with this problem too. Legacy code, eh?
Ian.
Edit: You could also look at java.io.SequenceInputStream which allows you to append streams to one another. You would need to put your prefix and suffix in byte arrays and wrap them in ByteArrayInputStreams but it's all fairly straightforward.
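The SequenceInputStream idea above can be sketched roughly as follows. This is only an illustration, not the poster's actual code: the class name RootlessXml, the method name parseWithFakeRoot, and the wrapper tag root are made up for the example. It also assumes the input is UTF-8 and does not start with an XML declaration or a byte order mark, since either would end up after the opening tag and make the parser fail.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class RootlessXml {
    // Hypothetical helper: wraps a rootless stream in <root>...</root> and parses it.
    static Document parseWithFakeRoot(InputStream rootless) throws Exception {
        // Prefix and suffix live in byte arrays, as suggested in the answer.
        InputStream prefix = new ByteArrayInputStream("<root>".getBytes(StandardCharsets.UTF_8));
        InputStream suffix = new ByteArrayInputStream("</root>".getBytes(StandardCharsets.UTF_8));

        // The two-argument constructor chains two streams, so nest it to chain three.
        InputStream wrapped =
                new SequenceInputStream(prefix, new SequenceInputStream(rootless, suffix));

        // The original bytes are streamed through untouched; only the tags are added.
        return DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(wrapped);
    }
}
```

The resulting document element is the fake root, so the real content is found among its children.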
Your XML document needs a single root element to be considered well formed. Without one, a conforming XML parser will reject it.
One way is to provide your own dummy wrapper without touching the original 'XML' (the 'XML' that is not well formed). Need the word for that:
Syntax
Example:
You could use another parser like Jsoup. It can parse XML without a root.
I think that even if an API had an option for this, it would only return the first node of the "XML", which would look like a root, and discard the rest.
So the answer is probably to do it yourself. Scanner or StringTokenizer might do the trick.
Some HTML parsers might also help, since they are usually less strict.
Here's what I did:
There's an old java.io.SequenceInputStream class, which is so old that it takes Enumeration rather than List or such.
With it, you can prepend and append the root element tags (<div> and </div> in my case) around your no-root XML stream. (You shouldn't do it by concatenating Strings due to performance and memory reasons.)
From here you can do whatever you like, but keep in mind the extra element.
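A minimal sketch of this approach, using the Enumeration constructor and then stepping past the extra element, might look like the following. The class name DivWrappedXml and the method name topLevelElements are hypothetical, and the sketch assumes UTF-8 input with no XML declaration at the start.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class DivWrappedXml {
    // Hypothetical helper: returns the top-level elements of a rootless XML stream.
    static List<Element> topLevelElements(InputStream rootless) throws Exception {
        List<InputStream> parts = Arrays.asList(
                new ByteArrayInputStream("<div>".getBytes(StandardCharsets.UTF_8)),
                rootless,
                new ByteArrayInputStream("</div>".getBytes(StandardCharsets.UTF_8)));

        // SequenceInputStream takes an Enumeration, so adapt the List to one.
        InputStream wrapped = new SequenceInputStream(Collections.enumeration(parts));
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(wrapped);

        // Skip the extra <div> element: collect only its element children,
        // ignoring whitespace text nodes between them.
        List<Element> result = new ArrayList<>();
        NodeList children = doc.getDocumentElement().getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                result.add((Element) child);
            }
        }
        return result;
    }
}
```

Returning only the children of the wrapper is one way to keep the extra element from leaking into the rest of the code.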