How do I load an XHTML file into an XElement using a custom XmlUrlResolver?
I am trying to load an XHTML file into a LINQ XElement, but I am running into problems with the resolver. The problem has to do with the following declaration:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
I have a custom XmlUrlResolver with an overridden GetEntity that maps links such as http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd to a local resource stream. This works fine for almost the entire XHTML DTD. The only one I cannot resolve is the Uri "-//W3C//DTD XHTML 1.0 Transitional//EN", and I am not sure what I should be doing with it.
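public override object GetEntity(Uri absoluteUri, string role, Type ofObjectToReturn)
{
    // Build the embedded resource name from the last segment of the URI (the DTD file name).
    var resourceName = "ePub.DTD." + absoluteUri.Segments[absoluteUri.Segments.Length - 1];
    if (_resources.Contains(resourceName))
    {
        // Serve the DTD from the assembly's embedded resources instead of the network.
        Stream dataStream = Assembly.GetExecutingAssembly().GetManifestResourceStream(resourceName);
        return dataStream;
    }
    // Anything not embedded falls through to the default XmlUrlResolver.
    return base.GetEntity(absoluteUri, role, ofObjectToReturn);
}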
As you can see in the code above, anything I cannot resolve is handled by the default XmlUrlResolver. That includes the identifier above starting with -//W3C/. For that identifier, however, the base method throws a DirectoryNotFoundException. If I continue past the exception, the XElement loads just fine. If I instead return an empty stream, an error is thrown while loading the XHTML into the XElement.
Does anyone have any clues about handling such a PUBLIC definition with a custom XmlUrlResolver?
Answer stolen from somewhere on the Microsoft boards:
This behavior is by design. When both the public ID and system ID are specified in the DOCTYPE declaration, the XmlReader first checks whether XmlResolver.GetEntity understands the public identifier ("-//W3C//DTD XHTML 1.1//EN"). So it calls GetEntity with the public ID, and if the resolver does not understand it (like the XmlUrlResolver), it throws an exception. The XmlReader catches the exception and calls GetEntity again, but this time with the system identifier ("http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd").
Thanks,
-Helena Kotas, System.Xml Developer
Posted by Microsoft on 10 May 2006 at 17:34
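Building on the explanation above, one way to avoid the exception for the public identifier is to recognize it before it is turned into a file path. The following is only a minimal sketch, not a confirmed fix: it assumes the reader passes the public identifier through ResolveUri before calling GetEntity, and the class name LocalDtdResolver and the PublicToSystem table are hypothetical names introduced for illustration. The "ePub.DTD." resource prefix is taken from the question.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Reflection;
using System.Xml;

// Hypothetical resolver: maps known public IDs to their system IDs,
// then serves the DTD files from embedded resources as in the question.
public class LocalDtdResolver : XmlUrlResolver
{
    // Assumed mapping; extend it with any other public identifiers you need.
    private static readonly Dictionary<string, string> PublicToSystem =
        new Dictionary<string, string>
        {
            { "-//W3C//DTD XHTML 1.0 Transitional//EN",
              "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd" }
        };

    public override Uri ResolveUri(Uri baseUri, string relativeUri)
    {
        string systemId;
        if (relativeUri != null && PublicToSystem.TryGetValue(relativeUri, out systemId))
        {
            // Known public ID: hand back the matching system URI instead of
            // letting it be resolved as a relative file path.
            return new Uri(systemId);
        }
        return base.ResolveUri(baseUri, relativeUri);
    }

    public override object GetEntity(Uri absoluteUri, string role, Type ofObjectToReturn)
    {
        // "ePub.DTD." is the resource prefix used in the question.
        string resourceName = "ePub.DTD." + Path.GetFileName(absoluteUri.AbsolutePath);
        Stream stream = Assembly.GetExecutingAssembly().GetManifestResourceStream(resourceName);
        if (stream != null)
        {
            return stream;
        }
        // Not embedded: fall back to the default behavior.
        return base.GetEntity(absoluteUri, role, ofObjectToReturn);
    }
}
```

With this in place, the resolver would be assigned to XmlReaderSettings.XmlResolver (with DTD processing enabled) and the resulting XmlReader passed to XElement.Load. Because the public identifier now resolves to the same URI as the system identifier, the embedded DTD should be served on the first attempt, so the reader never needs the exception-and-retry path described in the answer.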