Java, XML DocumentBuilder - setting the encoding when parsing

Posted on 2024-09-16 03:54:18

I'm trying to save a tree (a class that extends JTree) which holds an XML document back to a DOM object after changing its structure.

I have created a new Document object and traversed the tree to retrieve the contents successfully (including the original encoding of the XML document). I now have a ByteArrayInputStream that holds the tree contents (the XML document) in the correct encoding.

The problem is that when I parse the ByteArrayInputStream, the encoding in the XML document is automatically changed to UTF-8.

Is there a way to prevent this and use the encoding provided in the ByteArrayInputStream?

It's also worth adding that I have already used the
transformer.setOutputProperty(OutputKeys.ENCODING, encoding) method to retrieve the right encoding.

Any help would be appreciated.

Comments (4)

岁吢 2024-09-23 03:54:18

Here's an updated answer, since OutputFormat is deprecated:

TransformerFactory tf = TransformerFactory.newInstance();
Transformer transformer = tf.newTransformer();
transformer.setOutputProperty(OutputKeys.ENCODING, "ISO-8859-1");

StringWriter writer = new StringWriter();
transformer.transform(new DOMSource(document), new StreamResult(writer));
String output = writer.getBuffer().toString().replaceAll("\n|\r", "");

The second part returns the XML document as a String, with line breaks stripped.
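
One caveat with the StringWriter approach: a String has no byte encoding of its own, so the output property mainly controls the encoding named in the XML declaration. If the goal is bytes in that encoding, the result can be written to an OutputStream instead. The following is a minimal sketch using only JDK classes; the class name EncodingDemo and the helper names parse and serialize are made up for illustration and are not part of the answer above.

```java
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class EncodingDemo {
    // Hypothetical helper: parse XML text from a character stream.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Hypothetical helper: serialize the document to bytes in the given encoding.
    static byte[] serialize(Document document, String encoding) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.ENCODING, encoding);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // With a byte-stream result, the transformer does the character encoding itself.
        transformer.transform(new DOMSource(document), new StreamResult(out));
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        Document document = parse("<root>caf\u00e9</root>");
        byte[] bytes = serialize(document, "ISO-8859-1");
        // Decode with the same charset only to display the result.
        System.out.println(new String(bytes, StandardCharsets.ISO_8859_1));
    }
}
```

The byte array can then be written to a file or wrapped in a ByteArrayInputStream without the declaration and the actual bytes disagreeing.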

败给现实 2024-09-23 03:54:18
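import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.apache.xml.serialize.OutputFormat;
import org.apache.xml.serialize.XMLSerializer;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Read XML
String xml = "xml"; // placeholder for the actual XML text
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document document = builder.parse(new InputSource(new StringReader(xml)));

// Configure output formatting
OutputFormat format = new OutputFormat(document);

// Keep the encoding named in the XML declaration, if there is one
if (document.getXmlEncoding() != null) {
  format.setEncoding(document.getXmlEncoding());
}

format.setLineWidth(100);
format.setIndenting(true);
format.setIndent(5);

// Serialize the document to a String
Writer out = new StringWriter();
XMLSerializer serializer = new XMLSerializer(out, format);
serializer.serialize(document);
String result = out.toString();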
寄人书 2024-09-23 03:54:18

I solved it after a lot of trial and error.

I was using

OutputFormat format = new OutputFormat(document);

but changed it to

OutputFormat format = new OutputFormat(d, encoding, true);

and this solved my problem.

encoding is the encoding I want the output to use.
true refers to whether or not indenting is enabled.

Note to self: read more carefully. I had looked at the javadoc hours ago; if only I had read it more carefully.
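
For readers who cannot use the Xerces OutputFormat class, a similar effect is available through the DOM Level 3 Load and Save API that ships with the JDK. This is a different technique from the constructor above, shown only as a sketch; the class name LsEncodingDemo and the helper serialize are illustrative assumptions, and pretty-printing is enabled only if the serializer reports that it supports it.

```java
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSOutput;
import org.w3c.dom.ls.LSSerializer;
import org.xml.sax.InputSource;

class LsEncodingDemo {
    // Hypothetical helper: serialize the document to bytes in the given encoding.
    static byte[] serialize(Document document, String encoding) {
        DOMImplementationLS ls =
                (DOMImplementationLS) document.getImplementation().getFeature("LS", "3.0");
        LSSerializer serializer = ls.createLSSerializer();
        // Indent the output when the implementation allows it.
        if (serializer.getDomConfig().canSetParameter("format-pretty-print", Boolean.TRUE)) {
            serializer.getDomConfig().setParameter("format-pretty-print", Boolean.TRUE);
        }
        LSOutput output = ls.createLSOutput();
        output.setEncoding(encoding); // encoding for both the declaration and the bytes
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        output.setByteStream(bytes);
        serializer.write(document, output);
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><root><child/></root>";
        Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        // Reuse the encoding named in the declaration, falling back to UTF-8.
        String encoding = document.getXmlEncoding() != null ? document.getXmlEncoding() : "UTF-8";
        System.out.println(new String(serialize(document, encoding), encoding));
    }
}
```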

囍笑 2024-09-23 03:54:18

This worked for me and is very simple. No need for a transformer or output formatter:

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
InputSource is = new InputSource(inputStream);
is.setEncoding("ISO-8859-1"); // set your encoding here
Document document = builder.parse(is);
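
To illustrate when setEncoding matters, here is a small self-contained sketch that parses ISO-8859-1 bytes carrying no XML declaration. Without the hint, a parser would normally assume UTF-8 for such input. The class name InputEncodingDemo and the helper parse are hypothetical, and how a parser resolves a conflict between this hint and an encoding named in the document itself is not covered here.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class InputEncodingDemo {
    // Hypothetical helper: parse raw bytes, telling the parser which encoding they use.
    static Document parse(byte[] xmlBytes, String encoding) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        InputSource source = new InputSource(new ByteArrayInputStream(xmlBytes));
        source.setEncoding(encoding);
        return builder.parse(source);
    }

    public static void main(String[] args) throws Exception {
        // The e-acute is a single byte (0xE9) in ISO-8859-1, which is not valid UTF-8 here.
        byte[] latin1 = "<root>caf\u00e9</root>".getBytes(StandardCharsets.ISO_8859_1);
        Document document = parse(latin1, "ISO-8859-1");
        System.out.println(document.getDocumentElement().getTextContent());
    }
}
```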