How do I load an XHTML file into an XElement using a custom XmlUrlResolver?

Posted on 2024-08-20 06:23:50

I am trying to load an XHTML file into a LINQ XElement. However, I am running into problems with the resolver. The problem has to do with the following declaration:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

I have a custom XmlUrlResolver with an overridden GetEntity that maps links such as http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd to a local resource stream. This works for almost the entire XHTML DTD. The only one I cannot resolve is the URI "-//W3C//DTD XHTML 1.0 Transitional//EN", and I am not sure what I should do with it.

    public override object GetEntity(Uri absoluteUri, string role, Type ofObjectToReturn)
    {
        // Map the last path segment (e.g. "xhtml1-transitional.dtd") to an embedded resource name.
        var resourceName = "ePub.DTD." + absoluteUri.Segments[absoluteUri.Segments.Length - 1];
        if (_resources.Contains(resourceName))
        {
            return Assembly.GetExecutingAssembly().GetManifestResourceStream(resourceName);
        }
        // Anything that is not embedded falls through to the default XmlUrlResolver.
        return base.GetEntity(absoluteUri, role, ofObjectToReturn);
    }

As you can see in the code above, anything I cannot resolve is handled by the default XmlUrlResolver. That includes the identifier above that starts with -//W3C/. For that identifier, however, the base method throws a DirectoryNotFoundException. If I continue past the exception, the XElement loads just fine. If I instead return an empty stream, an error is thrown while the XHTML is being loaded into the XElement.

Does anyone have any clues about handling such a PUBLIC identifier with a custom XmlUrlResolver?

陌上青苔 2024-08-27 06:23:50

Answer taken from somewhere on the Microsoft boards:

This behavior is by design. When both the public ID and the system ID are specified in the DOCTYPE declaration, the XmlReader first checks whether XmlResolver.GetEntity understands the public identifier ("-//W3C//DTD XHTML 1.1//EN"). So it calls GetEntity with the public ID, and if the resolver does not understand it (as is the case with XmlUrlResolver), the resolver throws an exception. The XmlReader catches the exception and calls GetEntity again, this time with the system identifier ("http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd").

Thanks,
-Helena Kotas, System.Xml Developer

Posted by Microsoft on 10 May 2006 at 17:34
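
If the first-chance exception is a nuisance (for example, when the debugger breaks on it), one possible approach is to teach the resolver about the public identifier so that it never reaches the base class. The following is only a minimal sketch under stated assumptions: `LocalDtdResolver`, `Demo`, and the two dictionaries are hypothetical names, and the one-line DTD text stands in for the embedded resources used in the question. It maps the public ID to its system ID in `ResolveUri` and then serves the DTD from memory in `GetEntity`.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Linq;

// Hypothetical resolver: serves known DTDs from memory instead of the network.
public sealed class LocalDtdResolver : XmlUrlResolver
{
    private readonly Dictionary<string, string> _systemIdByPublicId;
    private readonly Dictionary<string, string> _dtdTextBySystemId;

    public LocalDtdResolver(
        Dictionary<string, string> systemIdByPublicId,
        Dictionary<string, string> dtdTextBySystemId)
    {
        _systemIdByPublicId = systemIdByPublicId;
        _dtdTextBySystemId = dtdTextBySystemId;
    }

    // A known public identifier is mapped to its system identifier, so it is
    // never turned into a bogus file path by the base class.
    public override Uri ResolveUri(Uri baseUri, string relativeUri)
    {
        string systemId;
        if (relativeUri != null && _systemIdByPublicId.TryGetValue(relativeUri, out systemId))
        {
            return new Uri(systemId);
        }
        return base.ResolveUri(baseUri, relativeUri);
    }

    // Known system identifiers are answered locally; everything else falls
    // through to the default XmlUrlResolver behavior.
    public override object GetEntity(Uri absoluteUri, string role, Type ofObjectToReturn)
    {
        string dtdText;
        if (_dtdTextBySystemId.TryGetValue(absoluteUri.AbsoluteUri, out dtdText))
        {
            return new MemoryStream(Encoding.UTF8.GetBytes(dtdText));
        }
        return base.GetEntity(absoluteUri, role, ofObjectToReturn);
    }
}

public static class Demo
{
    public const string PublicId = "-//W3C//DTD XHTML 1.0 Transitional//EN";
    public const string SystemId = "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd";

    // Assumption: a one-entity stand-in for the real DTD, which the question
    // loads from embedded resources.
    private const string StandInDtd = "<!ENTITY nbsp \"&#160;\">";

    public static XElement Load(string xhtml)
    {
        var resolver = new LocalDtdResolver(
            new Dictionary<string, string> { { PublicId, SystemId } },
            new Dictionary<string, string> { { SystemId, StandInDtd } });

        var settings = new XmlReaderSettings
        {
            DtdProcessing = DtdProcessing.Parse, // DTDs are rejected by default
            XmlResolver = resolver
        };

        using (var reader = XmlReader.Create(new StringReader(xhtml), settings))
        {
            return XElement.Load(reader);
        }
    }
}
```

Calling `Demo.Load` with a document that carries the DOCTYPE from the question should go through the local DTD without touching the file system or the network. Mapping the public ID to the system URL, rather than to an invented URI, keeps the base URI meaningful for relative references inside the real DTD, such as xhtml-lat1.ent. On .NET 4.0 and later, `System.Xml.Resolvers.XmlPreloadedResolver` with `XmlKnownDtds.Xhtml10` ships the XHTML 1.0 DTDs preloaded and may make a custom resolver unnecessary.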
