Java XML unmarshalling with JAXB fails on ampersand (&)
I have the following XML:
<?xml version="1.0" encoding="UTF-8"?>
<details>
...
<address1>Test&amp;Address</address1>
...
</details>
When I try to unmarshal it using JAXB, it throws the following exception:
Caused by: org.xml.sax.SAXParseException: The reference to entity "Address" must end with the ';' delimiter.
at org.apache.xerces.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
at org.apache.xerces.util.ErrorHandlerWrapper.fatalError(Unknown Source)
at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
at org.apache.xerces.impl.XMLScanner.reportFatalError(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanEntityReference(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
at com.sun.xml.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal0(UnmarshallerImpl.java:194)
But when I changed the &amp; in the XML to &apos;, it works. It looks like the problem is only with the ampersand (&amp;), and I cannot understand why.
The code to unmarshal is:
JAXBContext context = JAXBContext.newInstance("some.package.name", this.getClass().getClassLoader());
Unmarshaller unmarshaller = context.createUnmarshaller();
obj = unmarshaller.unmarshal(new StringReader(xml));
Anyone have some insight?
EDIT: I tried the solution suggested by @abhin4v below (i.e., adding a space after the &), but that doesn't work either. Here's the stack trace:
Caused by: org.xml.sax.SAXParseException: The entity name must immediately follow the '&' in the entity reference.
at org.apache.xerces.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
at org.apache.xerces.util.ErrorHandlerWrapper.fatalError(Unknown Source)
at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
at org.apache.xerces.impl.XMLScanner.reportFatalError(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanEntityReference(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
at com.sun.xml.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal0(UnmarshallerImpl.java:194)
I've run into this too. On the first pass I simply replaced the & with a token string (AMPERSAND_TOKEN), sent the XML through JAXB, then put the ampersand back afterwards. Not ideal, but it was a quick fix.
Second pass I made a lot of significant changes, so I'm not sure what exactly solved the problem. I suspect that providing JAXB access to the html dtds made it much happier, but that's only a guess and could be specific to my project.
HTH
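
The token round trip described in this answer could look roughly like the sketch below. It is only an illustration: the class and method names are made up, and the JDK's built-in DOM parser stands in for JAXB so the example runs without extra dependencies. The placeholder has to be a string that can never occur in the real data.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of JAXB or of the original answer.
class AmpersandTokenWorkaround {

    // Placeholder from the answer; assumed never to appear in real input.
    static final String TOKEN = "AMPERSAND_TOKEN";

    /** Returns the text of the first <address1> element, tolerating bare '&'. */
    static String readAddress1(String rawXml) {
        // Step 1: hide every '&' so the parser never sees an entity reference.
        String safeXml = rawXml.replace("&", TOKEN);
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(safeXml)));
            String value = doc.getElementsByTagName("address1").item(0).getTextContent();
            // Step 2: put the ampersand back into the parsed value.
            return value.replace(TOKEN, "&");
        } catch (Exception e) {
            throw new IllegalArgumentException("XML could not be parsed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<details><address1>Test&Address</address1></details>";
        System.out.println(readAddress1(xml)); // prints Test&Address
    }
}
```

One caveat of this shortcut: genuine entity references such as &lt; are hidden as well, so they come back as literal text (&lt;) instead of the character they stand for.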

Xerces converts &amp; to & and then tries to resolve &Address, which fails because it does not end with ;. Putting a space between & and Address will not work, because Xerces will then try to resolve the & on its own and throw the second error given in the question. You can wrap the text in a CDATA section and Xerces will not try to resolve the entities.

It turns out that the problem is caused by the framework I'm using (the Mentawai framework). The XML comes from the POST body of an HTTP request.
Apparently, the framework converts the character entities in the XML body, so &amp; becomes & and the unmarshaller fails to unmarshal the XML.

I've found that adding amp; after the & (so that it reads &amp;) fixes the unmarshalling error.
I think this tells the unmarshaller that the ampersand should be read as a data value (text in this case) instead of an entity identifier. You can see from your errors that it is attempting to treat "Address", which immediately follows the &, as an entity name.
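
To see the behaviour these answers describe, here is a small self-contained sketch. It uses the JDK's built-in DOM parser rather than JAXB (an assumption made so that it runs without extra libraries), and the class and method names are invented for the example. It parses the same element three ways: with a bare &, with &amp;, and wrapped in a CDATA section.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical demo class; only standard JAXP/DOM calls are used.
class AmpersandDemo {

    /**
     * Returns the text of the first <address1> element.
     * Throws IllegalArgumentException if the XML is not well-formed.
     */
    static String address1(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName("address1").item(0).getTextContent();
        } catch (SAXParseException e) {
            // A bare '&' ends up here: the parser expects an entity reference.
            throw new IllegalArgumentException(e.getMessage(), e);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String escaped = "<details><address1>Test&amp;Address</address1></details>";
        String cdata = "<details><address1><![CDATA[Test&Address]]></address1></details>";
        String bare = "<details><address1>Test&Address</address1></details>";

        System.out.println(address1(escaped)); // prints Test&Address
        System.out.println(address1(cdata));   // prints Test&Address
        try {
            address1(bare);
        } catch (IllegalArgumentException e) {
            // The parser rejects the bare '&' as an unterminated entity reference.
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

If the XML reaches the parser with &amp; or CDATA intact, the value comes out as Test&Address. If something upstream (such as the framework mentioned above) has already decoded &amp; to &, the parser sees the bare form and fails.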