Reading an RSS feed with Linq-to-XML and C# - how to decode the CDATA section?

Posted on 2024-08-11 04:40:42

I am trying to read an RSS feed using C# and LINQ to XML. The feed is encoded in UTF-8 (see http://pc03224.kr.hsnr.de/infosys/feed/), and reading it generally works fine, except for the description node, whose content is enclosed in a CDATA section.

For some reason I can't see the CDATA tag in the debugger after reading the content of the "description" tag, but I guess it must be there somewhere, because only in this section are the German umlauts (äöü) and other special characters not shown correctly. Instead, they remain in the string UTF-8 encoded, like &#252;.

Can I somehow read them out correctly or at least decode them afterwards?

This is a sample of the RSS section giving me troubles:

<description><![CDATA[blabla bietet H&#246;rern meiner Vorlesungen &#8220;IAS&#8221;, &#8220;WEB&#8221; und &#8220;SWE&#8221; an, Lizenzen f&#252;r blabla [...]]]></description>

Here is my code which reads out and parses the RSS feed data:

RssItems = (from xElem in xml.Descendants("channel").Descendants("item")
            select new RssItem
            {
                Content = xElem.Descendants("description").FirstOrDefault().Value,
                ...
            }).ToList();
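
For reference, here is a minimal, self-contained sketch that reproduces the behaviour and shows one possible way to decode the text afterwards. It assumes the description contains HTML numeric character references such as &#252; inside the CDATA section, which the XML parser treats as literal text rather than as references. The class name CdataDemo, the helper ReadDescription and the inline feed string are made up for illustration and are not part of my real code. WebUtility.HtmlDecode comes from System.Net.

```csharp
using System;
using System.Linq;
using System.Net;
using System.Xml.Linq;

public static class CdataDemo
{
    // Hypothetical helper: returns the description text with HTML character
    // references (e.g. &#252;) decoded, or null if the element is missing.
    public static string ReadDescription(XElement item)
    {
        // The explicit string conversion yields null instead of throwing
        // when there is no <description> child.
        string raw = (string)item.Element("description");
        return raw == null ? null : WebUtility.HtmlDecode(raw);
    }

    public static void Main()
    {
        // Inline stand-in for the real feed (assumption for this sketch).
        const string rss =
            "<rss><channel><item>" +
            "<description><![CDATA[H&#246;rern f&#252;r &#8220;IAS&#8221;]]></description>" +
            "</item></channel></rss>";

        XDocument xml = XDocument.Parse(rss);
        XElement item = xml.Descendants("channel").Descendants("item").First();

        // .Value drops the CDATA wrapper but leaves its content untouched,
        // so the character references are still literal text here.
        Console.WriteLine(item.Element("description").Value);

        // After decoding, the umlauts and curly quotes are real characters.
        Console.WriteLine(ReadDescription(item));
    }
}
```

If this assumption holds, the CDATA section itself would not be the culprit: its content is passed through unchanged, and the references only need an HTML decoding step after reading.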

Thanks in advance!
