StAX - Setting the version and encoding using XMLStreamWriter

Posted on 2024-09-03 10:04:42

I am using StAX to create XML files and then validating each file against an XSD.

I am getting an error while creating the XML file:

javax.xml.stream.XMLStreamException: Underlying stream encoding 'Cp1252' and input paramter for writeStartDocument() method 'UTF-8' do not match.
        at com.sun.xml.internal.stream.writers.XMLStreamWriterImpl.writeStartDocument(XMLStreamWriterImpl.java:1182)

Here is the code snippet:

XMLOutputFactory xof = XMLOutputFactory.newInstance();

try {
  XMLStreamWriter xtw = xof.createXMLStreamWriter(new FileWriter(fileName));
  xtw.writeStartDocument("UTF-8", "1.0");
} catch (XMLStreamException e) {
  e.printStackTrace();
} catch (IOException ie) {
  ie.printStackTrace();
}

I am running this code on Unix. Does anybody know how to set the version and encoding style?

Comments (4)

木槿暧夏七纪年 2024-09-10 10:04:42

I would try the createXMLStreamWriter() overload that takes an output stream and an encoding.

[EDIT] I tried it, and it works after changing the createXMLStreamWriter line to:

XMLStreamWriter xtw = xof.createXMLStreamWriter(new FileOutputStream(fileName), "UTF-8");

[EDIT 2] For the record, here is a slightly more complete test:

String fileName = "Test.xml";
XMLOutputFactory xof = XMLOutputFactory.newInstance();
// try-with-resources closes the file: XMLStreamWriter.close() does not close the underlying stream
try (FileOutputStream out = new FileOutputStream(fileName))
{
  // The encoding given here must match the one passed to writeStartDocument()
  XMLStreamWriter xtw = xof.createXMLStreamWriter(out, "UTF-8");
  xtw.writeStartDocument("UTF-8", "1.0");
  xtw.writeStartElement("root");
  xtw.writeComment("This is an attempt to create an XML file with StAX");

  xtw.writeStartElement("foo");
  xtw.writeAttribute("order", "1");
    xtw.writeStartElement("meuh");
    xtw.writeAttribute("active", "true");
      xtw.writeCharacters("The cows are flying high this Spring");
    xtw.writeEndElement();
  xtw.writeEndElement();

  xtw.writeStartElement("bar");
  xtw.writeAttribute("order", "2");
    xtw.writeStartElement("tcho");
    xtw.writeAttribute("kola", "K");
      xtw.writeCharacters("Content of tcho tag");
    xtw.writeEndElement();
  xtw.writeEndElement();

  xtw.writeEndElement();
  xtw.writeEndDocument();
  // Flushes buffered output and releases the writer's resources
  xtw.close();
}
catch (XMLStreamException e)
{
  e.printStackTrace();
}
catch (IOException ie)
{
  ie.printStackTrace();
}
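
One way to confirm that the version and encoding actually ended up in the XML declaration is to read the document back with an XMLStreamReader. The following is only a sketch: the class name DeclarationCheck and the helper readDeclaration are hypothetical, and it parses an in-memory document rather than Test.xml so that it stays self-contained.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class, not part of the original answer.
class DeclarationCheck {

    // Returns { version, declared encoding } as found in the XML declaration.
    static String[] readDeclaration(byte[] xml) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new ByteArrayInputStream(xml));
            try {
                // A freshly created reader is positioned at START_DOCUMENT,
                // where the declaration values are available.
                return new String[] {
                    reader.getVersion(),
                    reader.getCharacterEncodingScheme()
                };
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not parse the document", e);
        }
    }

    public static void main(String[] args) {
        byte[] xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><root/>"
                .getBytes(StandardCharsets.UTF_8);
        String[] declaration = readDeclaration(xml);
        System.out.println("version  = " + declaration[0]);
        System.out.println("encoding = " + declaration[1]);
    }
}
```

To check a real file, the same helper could be fed the bytes of Test.xml instead of the in-memory sample.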
旧人哭 2024-09-10 10:04:42

This should work:

// ...
Writer writer = new OutputStreamWriter(new FileOutputStream(fileName), "UTF-8");
XMLStreamWriter xtw = xof.createXMLStreamWriter(writer);
xtw.writeStartDocument("UTF-8", "1.0");
// ...
白衬杉格子梦 2024-09-10 10:04:42

From the code it is hard to know for sure, but if you are relying on the default StAX implementation that JDK 1.6 provides (Sun SJSXP), I would recommend switching to Woodstox.
It is known to be less buggy than SJSXP, supports the whole Stax2 API, and is actively developed and supported (whereas the Sun version was written once and has received only a limited number of bug fixes).

But the bug in your code is this:

XMLStreamWriter xtw = xof.createXMLStreamWriter(new FileWriter(fileName));

You are relying on the default platform encoding (which must be CP-1252, so Windows?). You should always specify the encoding you are using explicitly. The stream writer is just verifying that you are not doing something dangerous, and it has spotted an inconsistency that could produce a corrupt document. Pretty smart, which actually suggests that this is not the default StAX processor. :-)

(The other answer also gives a correct fix: pass an OutputStream and the encoding, and let XMLStreamWriter do the right thing.)
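
To see why FileWriter is the culprit, it can help to print which encoding a writer picks up when none is given. This is a minimal sketch with hypothetical names (WriterEncodingDemo and its two helpers); it uses OutputStreamWriter over an in-memory stream, on the assumption that FileWriter(String) resolves its charset the same way, since it is a subclass that relies on the platform default.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original answer.
class WriterEncodingDemo {

    // Encoding reported by a writer created without an explicit charset.
    // The result depends on the platform (and on the JDK version:
    // since JDK 18 the default is UTF-8 unless it is overridden).
    static String defaultWriterEncoding() {
        return new OutputStreamWriter(new ByteArrayOutputStream()).getEncoding();
    }

    // Encoding reported by a writer created with an explicit charset.
    // getEncoding() may return a historical name such as "UTF8".
    static String explicitWriterEncoding() {
        return new OutputStreamWriter(new ByteArrayOutputStream(),
                StandardCharsets.UTF_8).getEncoding();
    }

    public static void main(String[] args) {
        System.out.println("Platform default charset: " + Charset.defaultCharset());
        System.out.println("Writer without charset:   " + defaultWriterEncoding());
        System.out.println("Writer with UTF-8:        " + explicitWriterEncoding());
    }
}
```

On a machine whose default charset is windows-1252, the second line would be expected to show a name like the 'Cp1252' from the exception message, which is the value the stream writer then compares against the 'UTF-8' passed to writeStartDocument().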

记忆消瘦 2024-09-10 10:04:42

If you are using the default XMLStreamWriter bundled with the Oracle JRE/JDK, you should always

  • create the XMLStreamWriter with an explicit character encoding: xmlOutputFactory.createXMLStreamWriter(out, encoding)
  • start the document with the same explicit encoding: xmlStreamWriter.writeStartDocument(encoding, version). The writer is not smart enough to remember the encoding that was set when it was created; it only checks that the two encodings match (see the code below).

This way, the file encoding and the XML declaration always stay in sync. Although specifying the encoding in the XML declaration is optional, XML best practice is to always specify it.
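
The two rules above can be sketched as a single helper that takes the charset once and uses it in both places, so the two values cannot drift apart. The class and method names (SyncedEncodingExample, writeDocument) are hypothetical, and the document is written to memory only to keep the example self-contained.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical helper class, not part of the original answer.
class SyncedEncodingExample {

    // Writes a tiny document, using the same charset name for the
    // stream writer and for the XML declaration.
    static byte[] writeDocument(Charset charset) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            XMLStreamWriter writer = XMLOutputFactory.newInstance()
                    .createXMLStreamWriter(out, charset.name());
            try {
                writer.writeStartDocument(charset.name(), "1.0");
                writer.writeStartElement("root");
                writer.writeCharacters("caf\u00e9"); // non-ASCII content
                writer.writeEndElement();
                writer.writeEndDocument();
                writer.flush();
            } finally {
                writer.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not write the document", e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] bytes = writeDocument(StandardCharsets.UTF_8);
        System.out.println(new String(bytes, StandardCharsets.UTF_8));
    }
}
```

Passing a Charset rather than two separate strings is a design choice for this sketch only; the StAX API itself still takes the encoding name as a String in both calls.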

This is the relevant check from the Oracle (Sun) implementation (SJSXP):

String streamEncoding = null;
if (fWriter instanceof OutputStreamWriter) {
    streamEncoding = ((OutputStreamWriter) fWriter).getEncoding();
}
else if (fWriter instanceof UTF8OutputStreamWriter) {
    streamEncoding = ((UTF8OutputStreamWriter) fWriter).getEncoding();
}
else if (fWriter instanceof XMLWriter) {
    streamEncoding = ((OutputStreamWriter) ((XMLWriter)fWriter).getWriter()).getEncoding();
}

if (streamEncoding != null && !streamEncoding.equalsIgnoreCase(encoding)) {
    // If the equality check failed, check for charset encoding aliases
    boolean foundAlias = false;
    Set aliases = Charset.forName(encoding).aliases();
    for (Iterator it = aliases.iterator(); !foundAlias && it.hasNext(); ) {
        if (streamEncoding.equalsIgnoreCase((String) it.next())) {
            foundAlias = true;
        }
    }
    // If no alias matches the encoding name, then report error
    if (!foundAlias) {
        throw new XMLStreamException("Underlying stream encoding '"
                + streamEncoding
                + "' and input paramter for writeStartDocument() method '"
                + encoding + "' do not match.");
    }
}
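
The alias lookup in that check is what lets an OutputStreamWriter created with "UTF-8" pass, even though its getEncoding() reports the historical name "UTF8". A simplified, standalone sketch of the same comparison follows; the class EncodingAliasCheck and the method encodingsMatch are hypothetical names, not part of SJSXP.

```java
import java.nio.charset.Charset;

// Hypothetical class that mirrors the comparison logic shown above.
class EncodingAliasCheck {

    // True if the stream's encoding name equals the declared encoding,
    // or equals one of the declared encoding's charset aliases.
    static boolean encodingsMatch(String streamEncoding, String declaredEncoding) {
        if (streamEncoding.equalsIgnoreCase(declaredEncoding)) {
            return true;
        }
        for (String alias : Charset.forName(declaredEncoding).aliases()) {
            if (streamEncoding.equalsIgnoreCase(alias)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // "UTF8" is an alias of UTF-8, so these names are treated as equal.
        System.out.println(encodingsMatch("UTF8", "UTF-8"));
        // "Cp1252" is not an alias of UTF-8: the case from the question.
        System.out.println(encodingsMatch("Cp1252", "UTF-8"));
    }
}
```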