Saxon-JS: overriding the document() function with one that does not fail on the same-origin policy

Posted on 2025-01-25 22:28:05

I am trying to use Saxon-JS to transform a local XML file, and I run into the silly same-origin restriction for:

loading of the stylesheet
loading of anything using the document function

I call the same-origin policy "silly" because it is easily circumvented by loading scripts instead, which makes all this same-origin stuff appear rather pathetic. Working with that idea, here is how I go about it. When the browser loads a file like this test.xml:

<?xml version="1.0" encoding="UTF-8"?>
<root>
  <script xmlns="http://www.w3.org/1999/xhtml" type="text/javascript" src="boot.js" defer="true"/>
  <foo/>
  <bar/>
</root>

We can have the boot.js do this:

function appendScript(src, fn) {
  const lastScript = ((x) => x[x.length - 1])(document.getElementsByTagNameNS("http://www.w3.org/1999/xhtml", "script"));
  const script = document.createElementNS("http://www.w3.org/1999/xhtml", "script");
  script.type = "text/javascript";  
  script.charset = 'utf-8';
  script.src = src; 
  script.defer = true;
  script.async = true;
  if(fn)
    script.onload = fn;
  lastScript.parentElement.insertBefore(script, lastScript.nextSibling);
  return script;
}   

appendScript("SaxonJS2.rt.js");
appendScript("test.sef.js");

tools = {}

function transform() {
  // "style" is the global variable set by test.sef.js (see below)
  SaxonJS.transform({
      sourceNode: document,
      stylesheetText: style,
      destination: "document"}, 'async')
  .then(result => {
      document.replaceChildren();
      document.appendChild(result.principalResult); });
}

setTimeout(transform, 500);

Something like that. Notice that I already cannot use the stylesheetLocation option, because that would attempt an XHR, which would fail. But if I simply turn the test.sef.json file into a JavaScript file, like this:

(echo -n 'style='; jq '.|tostring'  test.sef.json ; echo ';') > test.sef.js

then including it with a script element loads this JSON string into the variable style.
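
For readers without jq, the same wrapping can be sketched in JavaScript. The function name wrapSefAsScript is made up for this illustration; it only assumes that the input is the text of a .sef.json file and that the output should assign a JSON string to the global style, as the one-liner above does.

```javascript
// Hypothetical helper (not part of SaxonJS): turn the text of a .sef.json
// file into JavaScript source that assigns it, as a string, to `style`.
function wrapSefAsScript(sefJsonText) {
  // Parse and re-serialize so the payload is compact, like jq's tostring.
  const compact = JSON.stringify(JSON.parse(sefJsonText));
  // Stringify a second time to get a quoted, escaped string literal.
  return "style=" + JSON.stringify(compact) + ";";
}

// Example with a tiny stand-in object rather than a real SEF file.
const source = wrapSefAsScript('{ "N": "package", "version": "20" }');
console.log(source);
```

The double JSON.stringify is the point: the stylesheetText option expects text, so the script has to deliver a string, not an object literal.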

In the same way, I can replace the normal document function in XSLT with

<xsl:function name="f:document">
    <xsl:param name="url"/>
    <xsl:choose>
        <xsl:when test="true() or function-available('js:tools.document')">
            <xsl:sequence select="js:tools.document($url)"/>
        </xsl:when>
        <xsl:otherwise>
            <xsl:if test="doc-available($url)">
                <xsl:sequence select="document($url)"/>
            </xsl:if>
        </xsl:otherwise>
    </xsl:choose>
</xsl:function>

Here I note that function-available somehow doesn't work. That may be a minor issue, and for my testing it doesn't matter, so I put true() or ... in front of it.

Now this is how this tools.document function is defined:

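tools.document = function(url) {
  if(!tools.docs) tools.docs = {};
  let doc = tools.docs[url];
  if(!doc) {
    // appendScript is a global function, so it is called without "this".
    // The callback runs only after the script has loaded, that is, after
    // this function has already returned.
    const script = appendScript(url+".js", function() {
      const parser = new DOMParser();
      doc = parser.parseFromString(data, "text/xml");
      if(doc)
        tools.docs[url] = doc;
      delete window.data;
      script.remove(); });
  }
  return doc;
}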

As you can see, these XML documents are also loaded disguised as scripts, such as include.xml.js:

data=`<doc><foo/><bar/></doc>`

So, here is the problem. The transform is running in synchronous mode, and when it calls this js:tools.document() function, we would need to let go of the current execution thread so that the script element can be fetched and executed. Only then can we come back, parse the data, and store the result in our document cache.
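
The timing problem can be reproduced without Saxon-JS or a browser. In the sketch below, setTimeout stands in for the script load, and fakeDocument and cache are names invented for the illustration. The point is only that a callback scheduled from synchronous code cannot run before that code returns.

```javascript
// Stand-in for tools.document: the "load" finishes in a later task,
// just as a script element's onload handler would.
const cache = {};

function fakeDocument(url) {
  let doc = cache[url];
  if (!doc) {
    setTimeout(() => {
      doc = "<doc from " + url + ">"; // placeholder for the parsed XML
      cache[url] = doc;
    }, 0);
  }
  return doc; // still undefined on the first call
}

const first = fakeDocument("include.xml");
console.log(first); // undefined: the callback has not run yet

setTimeout(() => {
  // By now the earlier callback has filled the cache.
  console.log(fakeDocument("include.xml"));
}, 10);
```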

So, if I call f:document("include.xml"), the document is not yet available when the call returns synchronously. But if I call it manually once from the JavaScript console

tools.document("include.xml")

before running transform(), then the transform succeeds with the document properly loaded.
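
That observation suggests loading every needed document before the transform starts. The following is a minimal sketch of that idea, not a Saxon-JS feature: preloadAll and loadOne are hypothetical names, and loadOne stands for whatever asynchronous loader is in use (for example a Promise wrapper around appendScript). It assumes the list of URLs is known in advance.

```javascript
// Hypothetical preloader: fill a cache with every document up front so that
// a later synchronous lookup never has to wait.
function preloadAll(urls, loadOne, cache = {}) {
  const pending = urls.map((url) =>
    loadOne(url).then((doc) => {
      cache[url] = doc;
    })
  );
  // Resolves with the filled cache once every load has finished.
  return Promise.all(pending).then(() => cache);
}

// Stand-in loader for the demo; a browser version would append a script
// element and resolve from its onload handler.
const demoLoader = (url) =>
  new Promise((resolve) => setTimeout(() => resolve("<doc from " + url + ">"), 0));

const ready = preloadAll(["include.xml", "other.xml"], demoLoader);
ready.then((docs) => {
  // A synchronous lookup such as docs["include.xml"] now succeeds,
  // so this is where transform() would be started.
  console.log(Object.keys(docs));
});
```

The limitation is that URLs computed inside the stylesheet are not known beforehand, so this covers only the case where the set of documents is fixed.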

What is the minimum I could do to give control back to the main JavaScript event loop while the transform is still executing? Is there some trick by which one might capture the current state in some kind of continuation closure and resume it by adding that to the setTimeout queue?
