Loading XML or XHTML content with HTML-encoded or escaped characters
I'm developing a class for a content management system. The input content is supplied in XHTML format, and it can contain valid escaped characters such as &#163; (the character reference for £).
See the example below.
<html xml:lang="en" lang="en" xmlns="http://www.w3.org/1999/xhtml">
  <head xmlns="">
    <meta name="Attr_DocumentTitle" content="Hello World Books" />
  </head>
  <body>
    <div>British Pound &#163;</div>
    <div>Registered sign &#174;</div>
    <div>Copyright sign &#169; </div>
  </body>
</html>
My objective is to write a method that loads this into a .NET XML object, does some processing, and saves the result to a database. I want to keep the escaped characters as they are. Here is my method:
public static XmlDocument LoadXmlFromString(string xhtmlContent)
{
    // UTF-8 keeps any literal non-ASCII characters intact (ASCII would turn them into '?').
    byte[] xhtmlByte = Encoding.UTF8.GetBytes(xhtmlContent);
    MemoryStream mStream = new MemoryStream(xhtmlByte);
    XmlReaderSettings settings = new XmlReaderSettings();
    // Upon loading XML, prevent DTD download, which would be blocked by our
    // firewall and generate a "503 Server Unavailable" error.
    settings.XmlResolver = null;
    settings.ProhibitDtd = false; // obsolete since .NET 4; DtdProcessing.Parse is the equivalent
    XmlReader reader = XmlReader.Create(mStream, settings);
    XmlDocument xmlDoc = new XmlDocument();
    xmlDoc.Load(reader); // load through the configured reader so the settings above apply
    return xmlDoc; // xmlDoc.InnerXml contains the literal characters £ ® ©
                   // in place of their escaped forms (e.g. &#163; &#174; &#169;)
}
However, this method converts the escaped characters to their character equivalents. How can I avoid this and keep the escaped characters?
Comments (1)
Check this:
Why does XmlTextReader convert HTML-encoded UTF-8 characters to a UTF-8 string automatically?
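For what it's worth, an XML parser expands character references while loading, and the resulting DOM does not remember which characters were originally written in escaped form, so there is nothing to "keep" on the way in. One possible workaround is to re-escape on the way out: an XmlWriter that writes to a stream with an ASCII encoding emits every character that does not fit the encoding as a numeric character reference. The following is only a minimal sketch under that assumption. The class name XhtmlRoundTrip and both method names are made up for illustration, and the references come back in the writer's own (hexadecimal) form for all non-ASCII characters, not necessarily in the form the input used.

```csharp
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical helper, not part of the original question's code.
public static class XhtmlRoundTrip
{
    // Parses the XHTML string without resolving any external DTD or resource.
    public static XmlDocument Load(string xhtmlContent)
    {
        var settings = new XmlReaderSettings
        {
            XmlResolver = null,                  // never fetch external resources
            DtdProcessing = DtdProcessing.Parse  // same effect as ProhibitDtd = false
        };
        using (var reader = XmlReader.Create(new StringReader(xhtmlContent), settings))
        {
            var doc = new XmlDocument();
            doc.Load(reader); // character references are expanded here, by design
            return doc;
        }
    }

    // Serializes the document as ASCII-only text. Characters outside ASCII
    // are written by the XmlWriter as numeric character references.
    public static string ToAsciiXml(XmlDocument doc)
    {
        var settings = new XmlWriterSettings
        {
            Encoding = Encoding.ASCII,
            OmitXmlDeclaration = true
        };
        using (var stream = new MemoryStream())
        {
            using (var writer = XmlWriter.Create(stream, settings))
            {
                doc.Save(writer);
            } // disposing the writer flushes everything into the stream
            return Encoding.ASCII.GetString(stream.ToArray());
        }
    }
}
```

With this sketch you would keep working on the XmlDocument as usual and call XhtmlRoundTrip.ToAsciiXml(doc) only at the point where the markup is saved to the database. Note that the writer has to target a stream for this to work; writing to a StringBuilder or StringWriter uses UTF-16, where every character is representable and nothing gets escaped.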