
Java, XML DocumentBuilder - setting the encoding when parsing

Posted 2024-09-16 03:54:18 · 482 characters · 11 views · 0 comments

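I'm trying to save a tree (a class that extends JTree) that holds an XML document back to a DOM object after its structure has been changed.

I have created a new Document object and traversed the tree to retrieve the contents successfully (including the original encoding of the XML document). I now have a ByteArrayInputStream that holds the tree contents (the XML document) with the correct encoding.

The problem is that when I parse the ByteArrayInputStream, the encoding (in the XML document) is automatically changed to UTF-8.

Is there a way to prevent this and use the correct encoding as provided in the ByteArrayInputStream?

It is also worth adding that I have already used the
transformer.setOutputProperty(OutputKeys.ENCODING, encoding) method to retrieve the right encoding.

Any help would be appreciated.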


岁吢 2024-09-23 03:54:18

Here's an updated answer, since OutputFormat is deprecated:

TransformerFactory tf = TransformerFactory.newInstance();
Transformer transformer = tf.newTransformer();
// Encoding written into the XML declaration of the output
transformer.setOutputProperty(OutputKeys.ENCODING, "ISO-8859-1");

StringWriter writer = new StringWriter();
transformer.transform(new DOMSource(document), new StreamResult(writer));
// Strip line breaks from the serialized document
String output = writer.toString().replaceAll("[\r\n]", "");

The second part returns the XML document as a String.
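One caveat with the String approach: a Java String has no byte encoding of its own, so the ENCODING property mainly determines what the XML declaration says. If the String is later turned back into bytes (for example to feed a ByteArrayInputStream, as in the question), the bytes should be produced with the same charset the declaration names. The following is a minimal sketch of that idea; the class name `XmlStringToBytes` and its two helper methods are hypothetical names chosen for illustration, not part of the answer above.

```java
import java.io.StringWriter;
import java.nio.charset.Charset;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper class, for illustration only.
public class XmlStringToBytes {

    // Serialize the DOM to a String whose XML declaration names `encoding`.
    public static String toXmlString(Document document, String encoding) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.ENCODING, encoding);
        StringWriter writer = new StringWriter();
        transformer.transform(new DOMSource(document), new StreamResult(writer));
        return writer.toString();
    }

    // Encode the String with the same charset that the declaration names,
    // so that the declaration and the actual bytes agree.
    public static byte[] toBytes(String xml, String encoding) {
        return xml.getBytes(Charset.forName(encoding));
    }

    public static void main(String[] args) throws Exception {
        Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element root = document.createElement("root");
        root.setTextContent("caf\u00e9");
        document.appendChild(root);

        String xml = toXmlString(document, "ISO-8859-1");
        byte[] bytes = toBytes(xml, "ISO-8859-1");
        System.out.println(xml);
        System.out.println(bytes.length + " bytes");
    }
}
```

Calling `xml.getBytes()` without a charset would use the platform default instead, which may not match the declaration.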


败给现实 2024-09-23 03:54:18

// Read XML
String xml = "xml"; // placeholder for the XML text to parse
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document document = builder.parse(new InputSource(new StringReader(xml)));

// Append formatting
// (OutputFormat and XMLSerializer are the deprecated Xerces classes in org.apache.xml.serialize)
OutputFormat format = new OutputFormat(document);

// Reuse the encoding named in the source document's XML declaration, if any
if (document.getXmlEncoding() != null) {
  format.setEncoding(document.getXmlEncoding());
}

format.setLineWidth(100);
format.setIndenting(true);
format.setIndent(5);
Writer out = new StringWriter();
XMLSerializer serializer = new XMLSerializer(out, format);
serializer.serialize(document);
String result = out.toString();
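The key idea in the snippet above is reading `document.getXmlEncoding()` and handing it to the serializer. The same idea can be sketched with only the standard JAXP Transformer instead of the Xerces OutputFormat/XMLSerializer classes. This is a swapped-in technique and an assumption about what an equivalent might look like, not the answer author's code; the class name `PreserveDeclaredEncoding`, the method `reserialize`, and the UTF-8 fallback are hypothetical choices.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;

// Hypothetical helper class, for illustration only.
public class PreserveDeclaredEncoding {

    // Parse the bytes, then write the document back out using the encoding
    // that its own XML declaration named.
    public static byte[] reserialize(byte[] xmlBytes) throws Exception {
        Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xmlBytes));

        // getXmlEncoding() returns null when the declaration names no encoding;
        // falling back to UTF-8 is an assumption made for this sketch.
        String encoding = document.getXmlEncoding() != null ? document.getXmlEncoding() : "UTF-8";

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.ENCODING, encoding);
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");

        // Writing to an OutputStream lets the transformer produce the bytes itself.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        transformer.transform(new DOMSource(document), new StreamResult(out));
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><root>caf\u00e9</root>";
        byte[] result = reserialize(xml.getBytes(StandardCharsets.ISO_8859_1));
        System.out.println(new String(result, StandardCharsets.ISO_8859_1));
    }
}
```

The standard `OutputKeys` constants cover encoding and on/off indentation; finer control such as line width or indent size, as set on OutputFormat above, has no portable equivalent there.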


寄人书 2024-09-23 03:54:18

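I solved it after a lot of trial and error.

I was using

OutputFormat format = new OutputFormat(document);

but changed it to

OutputFormat format = new OutputFormat(d, encoding, true);

and this solved my problem.

Here d is the document, encoding is the encoding I want to set, and true says whether the output should be indented.

Note to self: read more carefully. I had looked at the javadoc hours earlier; if only I had read it properly.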


囍笑 2024-09-23 03:54:18

This worked for me and is very simple. There is no need for a transformer or an output formatter:

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
InputSource is = new InputSource(inputStream);
is.setEncoding("ISO-8859-1"); // set your encoding here
Document document = builder.parse(is);
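To make the effect of `InputSource.setEncoding` concrete, here is a small self-contained sketch. It parses ISO-8859-1 bytes that carry no XML declaration, a case where the parser would otherwise assume UTF-8. The class name `ParseWithEncoding` and the sample data are hypothetical; how a parser resolves a conflict between this setting and an encoding named in the document's own declaration is not something this sketch demonstrates.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only.
public class ParseWithEncoding {

    // Parse raw bytes, telling the parser explicitly which encoding to decode them with.
    public static Document parse(byte[] xmlBytes, String encoding) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        InputSource source = new InputSource(new ByteArrayInputStream(xmlBytes));
        source.setEncoding(encoding);
        return builder.parse(source);
    }

    public static void main(String[] args) throws Exception {
        // No XML declaration here, so nothing in the bytes names the encoding.
        byte[] latin1 = "<root>caf\u00e9</root>".getBytes(StandardCharsets.ISO_8859_1);
        Document document = parse(latin1, "ISO-8859-1");
        System.out.println(document.getDocumentElement().getTextContent());
    }
}
```

Note that this controls how the input bytes are decoded; the encoding used when the document is written out again is still chosen by whichever serializer produces the output.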
