Parsing an XML string containing "&#x20;" (which must be preserved)

Posted on 2024-08-31 02:39:18

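I have code that is passed a string containing XML. This XML may contain one or more instances of &#x20; (a character reference for the space character). I have a requirement that these references must not be resolved (i.e. they must not be replaced with an actual space character).

Is there any way to achieve this?

Basically, given a string containing the XML: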

<pattern value="[A-Z0-9&#x20;]" />

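I do not want it to be converted to:

<pattern value="[A-Z0-9 ]" />

(What I am actually trying to achieve is simply to take an XML string and write it to a "pretty-printed" file. This has the side effect of resolving occurrences of &#x20; in the string to a single space character, and those references need to be preserved. The reason for this requirement is that the written XML document must conform to an externally defined specification.)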

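I have tried creating a subclass of XmlTextReader to read from the XML string and overriding the ResolveEntity() method, but it is never called. I have also tried assigning a custom XmlResolver.

I have also tried, as suggested, to "double encode". Unfortunately, this did not have the desired effect, as the &#x26; is not decoded back to a bare & in the output. Here is the code I used: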

using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.XPath;

// "Double-encoded" input: &#x26; is the character reference for '&'.
string schemaText = @"...<pattern value=""[A-Z0-9&#x26;#x20;]"" />...";

// Pretty-printing options for the output document.
XmlWriterSettings writerSettings = new XmlWriterSettings();
writerSettings.Indent = true;
writerSettings.NewLineChars = Environment.NewLine;
writerSettings.Encoding = Encoding.Unicode;
writerSettings.CloseOutput = true;
writerSettings.OmitXmlDeclaration = false;
writerSettings.IndentChars = "\t";

// Parse the string, then write the whole tree back out into writtenSchema.
StringBuilder writtenSchema = new StringBuilder();
using ( StringReader sr = new StringReader( schemaText ) )
using ( XmlReader reader = XmlReader.Create( sr ) )
using ( TextWriter tr = new StringWriter( writtenSchema ) )
using ( XmlWriter writer = XmlWriter.Create( tr, writerSettings ) )
{
   XPathDocument doc = new XPathDocument( reader );
   XPathNavigator nav = doc.CreateNavigator();

   nav.WriteSubtree( writer );
}

The written XML ends up as:

<pattern value="[A-Z0-9&amp;#x20;]" />
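One possible workaround, offered only as a sketch and not as a confirmed solution from this thread, is to hide the reference from the parser altogether: swap every &#x20; for a placeholder character before parsing, pretty-print as usual, then swap the placeholder back in the output text. The class name Sketch, the method PrettyPrintPreserving, and the choice of U+E000 (a private-use character) as the placeholder are all assumptions made for illustration. The sketch also assumes that the input never contains U+E000 itself and that the reference is always spelled exactly &#x20;.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class Sketch
{
    // Hypothetical placeholder: assumed never to occur in the real input.
    private const string Placeholder = "\uE000";
    private const string SpaceReference = "&#x20;";

    public static string PrettyPrintPreserving(string xml)
    {
        // 1. Mask the reference so the parser only sees an ordinary character.
        string masked = xml.Replace(SpaceReference, Placeholder);

        XmlWriterSettings settings = new XmlWriterSettings();
        settings.Indent = true;
        settings.IndentChars = "\t";
        settings.NewLineChars = Environment.NewLine;

        // 2. Parse and re-serialize with indentation.
        StringBuilder output = new StringBuilder();
        using (StringReader sr = new StringReader(masked))
        using (XmlReader reader = XmlReader.Create(sr))
        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            // With the reader in its initial state, this copies the whole document.
            writer.WriteNode(reader, true);
        }

        // 3. Restore the original reference in the serialized text.
        return output.ToString().Replace(Placeholder, SpaceReference);
    }
}
```

Calling Sketch.PrettyPrintPreserving on a well-formed document that contains the pattern element above should leave &#x20; in the attribute value. Other spellings of the same reference (for example &#32;) would still be resolved to a space unless they are masked in the same way.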
