XDocument - how to convert non-HTML-safe characters

Posted on 2024-10-04 00:39:24

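I have a "title" attribute inside the elements of my UTF-8 XML, e.g.

<tag title="This is some test with special chars §£" />

Because I want the content of this attribute to be printed directly in an HTML page, I'm trying to get output like this:

<tag title="This is some test with special chars &#x00a7;&#x00a3;" />

The code fragment where I add the attribute looks like this:

new XElement( "tag",
    new XAttribute( "title" , title)
);

Characters such as & and " are escaped, but §£ are not, because they are valid UTF-8 characters. What should I change?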



番薯 2024-10-11 00:39:24

UTF-8 characters are supported in HTML as long as the page is declared as UTF-8.

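You should always specify the encoding used for an HTML or XML page. If you don't, you risk that characters in your content are incorrectly interpreted. This is not just an issue of human readability; increasingly, machines need to understand your data too. You should also check that you are not specifying different encodings in different places.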

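If the page's default encoding is a character set with a smaller repertoire, it will not render all UTF-8 characters correctly. If the document is declared as UTF-8, however, they should display fine.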
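To make the mismatch concrete, here is a minimal sketch in C# that encodes a string as UTF-8 and then decodes the same bytes as ISO-8859-1, which is roughly what a reader does when it assumes the wrong charset. The class and method names are hypothetical and only for illustration.

```csharp
using System;
using System.Text;

public static class EncodingMismatchDemo
{
    // Encodes the text as UTF-8, then decodes those bytes as ISO-8859-1,
    // simulating a consumer that assumes the wrong encoding.
    public static string ReadUtf8AsLatin1(string text)
    {
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");
        return latin1.GetString(utf8Bytes);
    }

    // Round trip with the same encoding on both sides.
    public static string ReadUtf8AsUtf8(string text)
    {
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);
        return Encoding.UTF8.GetString(utf8Bytes);
    }
}
```

Each of § and £ takes two bytes in UTF-8, so the mismatched decode returns four characters instead of two, while the matched round trip returns the original string unchanged.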

You may need to explicitly declare your page as UTF-8.

There are several ways to do this:
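The usual options are a <meta charset="utf-8"> element in the HTML head, a charset=utf-8 parameter on the HTTP Content-Type header, and, for XML, the encoding attribute of the XML declaration. As a rough sketch of the first and the last in C#, assuming the markup is built with LINQ to XML (the class and method names are hypothetical):

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml.Linq;

public static class Utf8DeclarationSketch
{
    // XML: put the encoding in the XML declaration and save to a stream,
    // so the declared encoding and the actual bytes agree.
    public static byte[] SaveXmlAsUtf8(string title)
    {
        var doc = new XDocument(
            new XDeclaration("1.0", "utf-8", null),
            new XElement("tag", new XAttribute("title", title)));

        using (var stream = new MemoryStream())
        {
            doc.Save(stream);
            return stream.ToArray();
        }
    }

    // HTML: declare the charset in a <meta> element inside <head>,
    // then encode the page text as UTF-8 bytes.
    public static byte[] BuildHtmlAsUtf8(string title)
    {
        var html = new XElement("html",
            new XElement("head",
                new XElement("meta", new XAttribute("charset", "utf-8")),
                new XElement("title", title)),
            new XElement("body",
                new XElement("p", title)));

        string page = "<!DOCTYPE html>\n" + html.ToString(SaveOptions.DisableFormatting);
        return new UTF8Encoding(false).GetBytes(page);
    }
}
```

In both cases § and £ are written as literal UTF-8 bytes rather than as entities, which is fine as long as the consumer honours the declaration.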


残月升风 2024-10-11 00:39:24

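Maybe you can convert those characters manually. I have used this before:

using System.Collections.Generic;

// Keys are HTML named entities; values are the literal characters they stand for.
private static readonly Dictionary<string, char> HTMLSymbolMap = new Dictionary<string, char>()
{
    {"&ndash;", '–'},
    {"&mdash;", '—'},
    {"&lsquo;", '‘'},
    {"&rsquo;", '’'},
    {"&sbquo;", '‚'},
    {"&ldquo;", '“'},
    {"&rdquo;", '”'},
    {"&bull;", '•'},
    {"&middot;", '·'},
    {"&bdquo;", '„'},
    {"&pound;", '£'},
    {"&sect;", '§'},
};

// Replaces each literal symbol in docText with its HTML entity.
public string CleanJunk(string docText)
{
    foreach (var kv in HTMLSymbolMap)
    {
        docText = docText.Replace(kv.Value.ToString(), kv.Key);
    }

    return docText;
}

Refer to this HTMLSymbol table for more info.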
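As an alternative to a hand-maintained table, and not something either answer proposes, the XML writer can be asked to do the escaping: when an XmlWriter writes to a stream in an encoding that cannot represent a character, it emits a numeric character reference for it. The sketch below assumes that serializing as ASCII is acceptable; the class and method names are hypothetical.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Linq;

public static class AsciiXmlSketch
{
    // Serializes the element through an ASCII-encoded XmlWriter so that
    // every non-ASCII character is written as a numeric character reference.
    public static string ToAsciiXml(XElement element)
    {
        var settings = new XmlWriterSettings
        {
            Encoding = Encoding.ASCII,   // forces escaping of anything above U+007F
            OmitXmlDeclaration = true    // keep only the element markup
        };

        using (var stream = new MemoryStream())
        {
            // The encoding setting only takes effect for stream output,
            // not when writing to a StringBuilder or TextWriter.
            using (XmlWriter writer = XmlWriter.Create(stream, settings))
            {
                element.Save(writer);
            }
            return Encoding.ASCII.GetString(stream.ToArray());
        }
    }
}
```

Called with new XElement("tag", new XAttribute("title", title)), this should yield the title with § and £ as hexadecimal references. The references are not zero-padded the way the question's &#x00a7; example is, but they denote the same characters.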

