Parsing an XML string containing "&#x20;" (which must be preserved)

Posted on 2024-08-31 02:39:18

I have code that is passed a string containing XML. This XML may contain one or more instances of &#x20; (a character reference for the space character). I have a requirement that these references should not be resolved (i.e. they should not be replaced with an actual space character).

Is there any way for me to achieve this?

Basically, given a string containing the XML:

<pattern value="[A-Z0-9&#x20;]" />

I do not want it to be converted to:

<pattern value="[A-Z0-9 ]" />

(What I am actually trying to achieve is to simply take an XML string and write it to a "pretty-printed" file. This has the side effect of resolving each occurrence of &#x20; in the string to a single space character, whereas the references need to be preserved. The reason for this requirement is that the written XML document must conform to an externally-defined specification.)

I have tried creating a sub-class of XmlTextReader to read from the XML string and overriding the ResolveEntity() method, but this isn't called. I have also tried assigning a custom XmlResolver.
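As a minimal illustration of that behaviour (a sketch, not the asker's code): a numeric character reference such as &#x20; is expanded by the reader itself while it reads the attribute value, so it never surfaces as an entity reference that a ResolveEntity() override could intercept. The helper name ReadPatternValue is hypothetical.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical helper: returns the "value" attribute of the root element
// exactly as the reader reports it.
static string ReadPatternValue(string xml)
{
    using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
    {
        reader.MoveToContent();              // position on the root element
        return reader.GetAttribute("value"); // character references are already expanded here
    }
}

string value = ReadPatternValue("<pattern value=\"[A-Z0-9&#x20;]\" />");

// The reference has become a literal space by the time the value is visible.
Console.WriteLine(value == "[A-Z0-9 ]");
```

Because the original spelling is gone once the reader has run, anything that must keep &#x20; verbatim has to protect it before parsing or restore it after writing.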

I have also tried, as suggested, to "double encode". Unfortunately, this has not had the desired effect: the escaped ampersand is written back out as &amp; rather than as a bare &. Here is the code I used:

using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.XPath;

// The ampersand is double-encoded as &#x26; so that the text "&#x20;" survives parsing.
string schemaText = @"...<pattern value=""[A-Z0-9&#x26;#x20;]"" />...";

// Pretty-printing options: tab indentation, platform newlines, XML declaration kept.
XmlWriterSettings writerSettings = new XmlWriterSettings();
writerSettings.Indent = true;
writerSettings.NewLineChars = Environment.NewLine;
writerSettings.Encoding = Encoding.Unicode;
writerSettings.CloseOutput = true;
writerSettings.OmitXmlDeclaration = false;
writerSettings.IndentChars = "\t";

StringBuilder writtenSchema = new StringBuilder();
using ( StringReader sr = new StringReader( schemaText ) )
using ( XmlReader reader = XmlReader.Create( sr ) )
using ( TextWriter tr = new StringWriter( writtenSchema ) )
using ( XmlWriter writer = XmlWriter.Create( tr, writerSettings ) )
{
   // Load the document, then write the whole tree back out through the indenting writer.
   XPathDocument doc = new XPathDocument( reader );
   XPathNavigator nav = doc.CreateNavigator();

   nav.WriteSubtree( writer );
}

The written XML ends up with:

<pattern value="[A-Z0-9&amp;#x20;]" />

Comments (2)

十年不长 2024-09-07 02:39:18

If you want it to be preserved, you need to double-encode it: &amp;#x20;. The XML reader will translate entities; that is more or less how XML works.

喵星人汪星人 2024-09-07 02:39:18
<pattern value="[A-Z0-9&amp;#x20;]" />

What I did above was replace "&" with "&amp;", thereby escaping the ampersand.
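Escaping the ampersand only covers the reading half: the writer escapes the ampersand again on output, so the pretty-printed text still needs one final textual replacement. The following is a rough sketch of that round trip, not a method confirmed by either answer. The helper name PrettyPrintKeepingSpaceRefs is hypothetical, the XML declaration is omitted for brevity, and the sketch assumes the input never contains a literal &amp;#x20; that should stay escaped.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical helper: pretty-prints XML while keeping "&#x20;" spelled out.
static string PrettyPrintKeepingSpaceRefs(string xml)
{
    // 1. Escape the ampersand so the reader sees the plain text "&#x20;"
    //    instead of a character reference it would expand to a space.
    string protectedXml = xml.Replace("&#x20;", "&amp;#x20;");

    var settings = new XmlWriterSettings
    {
        Indent = true,
        IndentChars = "\t",
        OmitXmlDeclaration = true // simplification for this sketch
    };

    var output = new StringBuilder();
    using (XmlReader reader = XmlReader.Create(new StringReader(protectedXml)))
    using (XmlWriter writer = XmlWriter.Create(output, settings))
    {
        // 2. Copy the document through the indenting writer; the writer
        //    escapes the ampersand again, producing "&amp;#x20;".
        writer.WriteNode(reader, true);
    }

    // 3. Undo the escaping in the written text to restore the original reference.
    return output.ToString().Replace("&amp;#x20;", "&#x20;");
}

string result = PrettyPrintKeepingSpaceRefs("<pattern value=\"[A-Z0-9&#x20;]\" />");
Console.WriteLine(result);
```

The last step is a plain string replacement, so it also rewrites any &amp;#x20; that was already present in the input. If that can occur, a placeholder that cannot appear in the document would be a safer marker than the escaped ampersand.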
