Our project was converted from XmlDocument to XDocument a few days ago, but we found strange behavior when XDocument.Parse processes an XML character entity in an attribute value. The sample code is as follows:
-
The XML string:
string xml = @"<char symbol=""&#x0;""/>";
-
The XmlDocument.LoadXml code and result:
XmlDocument xmlDocument = new XmlDocument();
xmlDocument.LoadXml(xml);
Console.WriteLine(xmlDocument.OuterXml);
Result:
<char symbol="&#x0;" />
-
The XDocument.Parse code and exception:
XDocument xDocument = XDocument.Parse(xml);
Console.WriteLine(xDocument.ToString());
Exception:
A first chance exception of type 'System.Xml.XmlException' occurred in System.Xml.dll
'.', hexadecimal value 0x00, is an invalid character. Line 1, position 18.
at System.Xml.XmlTextReaderImpl.Throw(Exception e)
at System.Xml.XmlTextReaderImpl.Throw(String res, String[] args)
at System.Xml.XmlTextReaderImpl.Throw(Int32 pos, String res, String[] args)
at System.Xml.XmlTextReaderImpl.ParseNumericCharRefInline(Int32 startPos, Boolean expand, StringBuilder internalSubsetBuilder, Int32& charCount, EntityType& entityType)
at System.Xml.XmlTextReaderImpl.ParseNumericCharRef(Boolean expand, StringBuilder internalSubsetBuilder, EntityType& entityType)
at System.Xml.XmlTextReaderImpl.HandleEntityReference(Boolean isInAttributeValue, EntityExpandType expandType, Int32& charRefEndPos)
at System.Xml.XmlTextReaderImpl.ParseAttributeValueSlow(Int32 curPos, Char quoteChar, NodeData attr)
at System.Xml.XmlTextReaderImpl.ParseAttributes()
at System.Xml.XmlTextReaderImpl.ParseElement()
at System.Xml.XmlTextReaderImpl.ParseDocumentContent()
at System.Xml.XmlTextReaderImpl.Read()
at System.Xml.Linq.XDocument.Load(XmlReader reader, LoadOptions options)
at System.Xml.Linq.XDocument.Parse(String text, LoadOptions options)
at System.Xml.Linq.XDocument.Parse(String text)
It seems that "&#x0;" is an invalid character: when we changed the value to a valid character such as "`", both methods worked well.
Is there any way to make XDocument.Parse ignore invalid characters in attributes, as XmlDocument.LoadXml does?
Comments (1)
According to this article, the value &#x0; is actually invalid. In my own experience, the XDocument class follows the XML standard much more strictly than XmlDocument (which I think is a good thing).
Read the article; it gives suggestions on how to get around that error.
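One workaround that is often suggested for this kind of error is to load the document through an XmlReader whose character checking is switched off. This is a minimal sketch, not necessarily what the linked article recommends. The class name LenientXmlLoader and the method ParseLenient are hypothetical names made up for illustration; XmlReaderSettings.CheckCharacters, XmlReader.Create and XDocument.Load are the real framework members it relies on.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper class, named here only for illustration.
public static class LenientXmlLoader
{
    // Parses XML text into an XDocument without validating that
    // character references fall inside the legal XML character range.
    public static XDocument ParseLenient(string xml)
    {
        // CheckCharacters defaults to true, which is why XDocument.Parse throws.
        var settings = new XmlReaderSettings { CheckCharacters = false };

        using (var stringReader = new StringReader(xml))
        using (var xmlReader = XmlReader.Create(stringReader, settings))
        {
            return XDocument.Load(xmlReader);
        }
    }
}
```

Calling LenientXmlLoader.ParseLenient(xml) in place of XDocument.Parse(xml) should let the sample string load. Note that the resulting document is still not well-formed XML in the strict sense, and serializing it again may fail, because XML writers perform their own character checks by default (XmlWriterSettings.CheckCharacters). If the data is under your control, removing or replacing the invalid reference before parsing is probably the cleaner fix.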