How to use the "about:" protocol of HTML5 in XSLT processors

Posted on 2024-11-27 08:12:11

The HTML5 draft specifies (at the moment at least) that the URI about:legacy-compat can be used for documents that rely on an XML-conforming doctype (which <!DOCTYPE html> isn't).

So I happen to have a bundle of HTML5-validating XML files that start with:

<!DOCTYPE html SYSTEM "about:legacy-compat">

Unfortunately, when I feed such an XHTML5 document to any XSLT processor like Xalan or Saxon, the processor naturally tries to resolve the (unresolvable) URI.

Is there any way to make them ignore the URI or faux-resolve it under the hood? The attempt to resolve it happens early in the processing of these documents, so for example Saxon's -dtd:off switch has no effect here.

Edit: The low-level approach sed -n '2,$p' <htmlfile> | otherapp unfortunately only works until I start using the document() XPath function to load another XHTML5 file.

Edit 2: I played around with XML catalogs and got them to work with both Saxon and Xalan. However, I then always get a

java.net.MalformedURLException: unknown protocol: about

Well, it's not surprising, but how can I circumvent this? The URL should never be parsed, just thrown away.
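
The exception does not depend on the XSLT processor: it can be reproduced with the JDK alone, since about:legacy-compat is a syntactically valid URI but the JVM has no stream handler registered for the "about" scheme. The following is a minimal sketch; the class name AboutUrlDemo and the helper tryCreate are made up for illustration.

```java
import java.net.MalformedURLException;
import java.net.URI;

class AboutUrlDemo {

    // Hypothetical helper: returns the exception message,
    // or null if the JVM accepted the URL.
    static String tryCreate(String spec) {
        try {
            // Parsing as a URI succeeds; converting to a URL needs a protocol handler.
            URI.create(spec).toURL();
            return null;
        } catch (MalformedURLException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // On a JVM without a handler for "about", this reports the unknown protocol.
        System.out.println(tryCreate("about:legacy-compat"));
    }
}
```

So anything that turns the system identifier into a java.net.URL will fail the same way unless a handler for "about" is made known to the JVM.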

Comments (1)

轻拂→两袖风尘 2024-12-04 08:12:11

Put this Java file into $somepath/foo/about/ as Handler.java:

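package foo.about;

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;
import java.nio.charset.StandardCharsets;

// Stream handler for the "about" protocol. The JVM locates it by naming
// convention: <package from java.protocol.handler.pkgs>.<protocol>.Handler
public class Handler extends URLStreamHandler {

    @Override
    protected URLConnection openConnection(URL url) throws IOException {
        return new URLConnection(url) {

            @Override
            public void connect() throws IOException {
                // Nothing to connect to; just mark the connection as open.
                connected = true;
            }

            @Override
            public InputStream getInputStream() throws IOException {
                // Serve a minimal stand-in DTD instead of resolving the URI.
                // (Replaces the deprecated StringBufferInputStream.)
                return new ByteArrayInputStream(
                        "<!ELEMENT html ANY>".getBytes(StandardCharsets.UTF_8));
            }
        };
    }
}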

Now go to $somepath and compile it:

javac foo/about/Handler.java

Add the following arguments to the JVM when calling Saxon:

-Djava.protocol.handler.pkgs=foo -cp "$somepath"

Here is a modified shell script (for *nix systems, but it is very similar for Windows):

#!/bin/sh

exec java -Djava.protocol.handler.pkgs=foo -classpath /usr/share/java/saxonb.jar:"$somepath" net.sf.saxon.Transform "$@"

If this doesn't work, you may want to adapt your local saxonb-xslt script instead.
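
If the transformation is started from your own Java code rather than from the command line, the same stub can probably be registered in-process instead of through the system property. The sketch below swaps in a different mechanism, URL.setURLStreamHandlerFactory, which is a standard JDK call but not what the answer above uses. The names AboutProtocolSetup, StubHandler, install and read are hypothetical. Keep in mind that the JVM allows the factory to be set only once, so this assumes no other library in the process has already installed one.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URI;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;
import java.nio.charset.StandardCharsets;

class AboutProtocolSetup {

    static final String STUB_DTD = "<!ELEMENT html ANY>";

    // Hypothetical helper: call once at startup, before any XML is parsed.
    // Returning null for other protocols lets the JVM fall back to its defaults.
    static void install() {
        URL.setURLStreamHandlerFactory(protocol ->
                "about".equals(protocol) ? new StubHandler() : null);
    }

    // Same idea as the Handler class above, but registered via the factory.
    static class StubHandler extends URLStreamHandler {
        @Override
        protected URLConnection openConnection(URL url) {
            return new URLConnection(url) {
                @Override
                public void connect() {
                    connected = true;
                }

                @Override
                public InputStream getInputStream() {
                    return new ByteArrayInputStream(
                            STUB_DTD.getBytes(StandardCharsets.UTF_8));
                }
            };
        }
    }

    // Hypothetical helper: reads whatever the given URL serves, as UTF-8 text.
    static String read(String spec) {
        try (InputStream in = URI.create(spec).toURL().openStream()) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[256];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        install();
        // With the factory in place, the URI resolves to the stub DTD.
        System.out.println(read("about:legacy-compat"));
    }
}
```

Whether a given XSLT processor then loads the doctype cleanly still depends on how its XML parser opens external DTDs, so treat this as a starting point to test against your own setup.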
