HTML in XMLHttpRequest - Web APIs 编辑

The W3C XMLHttpRequest specification adds HTML parsing support to XMLHttpRequest, which originally supported only XML parsing. This feature allows Web apps to obtain an HTML resource as a parsed DOM using XMLHttpRequest.

To get an overview of how to use XMLHttpRequest in general, see Using XMLHttpRequest.

Limitations

To discourage the synchronous use of XMLHttpRequest, HTML support is not available in the synchronous mode. Also, HTML support is only available if the responseType property has been set to "document". This limitation avoids wasting time parsing HTML uselessly when legacy code uses XMLHttpRequest in the default mode to retrieve responseText for text/html resources. Also, this limitation avoids problems with legacy code that assumes that responseXML is null for HTTP error pages (which often have a text/html response body).

Usage

Retrieving an HTML resource as a DOM using XMLHttpRequest works just like retrieving an XML resource as a DOM using XMLHttpRequest, except you can't use the synchronous mode and you have to explicitly request a document by assigning the string "document" to the responseType property of the XMLHttpRequest object after calling open() but before calling send().

var xhr = new XMLHttpRequest();
xhr.onload = function() {
  console.log(this.responseXML.title);
}
xhr.open("GET", "file.html");
xhr.responseType = "document";
xhr.send();

Feature detection

Method 1

This method relies on the "force async" nature of the feature. When you try to set responseType of an XMLHttpRequest object after it is opened as "sync". This throws an error in the browsers that implement the feature and works on others.

function HTMLinXHR() {
  if (!window.XMLHttpRequest)
    return false;
  var req = new window.XMLHttpRequest();
  req.open('GET', window.location.href, false);
  try {
    req.responseType = 'document';
  } catch(e) {
    return true;
  }
  return false;
}

View on JSFiddle

This method is synchronous, does not rely on external assets though it may not be as reliable as method 2 described below since it does not check the actual feature but an indication of that feature.

Method 2

There are two challenges to detecting exactly if a browser supports HTML parsing in XMLHttpRequest. First, the detection result is obtained asynchronously, because HTML support is only available in the asynchronous mode. Second, you have to actually fetch a test document over HTTP, because testing with a data: URL would end up testing data: URL support at the same time.

Thus, to detect HTML support, a test HTML file is needed on the server. This test file is small and is not well-formed XML:

<title>&amp;&<</title>

If the file is named detect.html, the following function can be used for detecting HTML parsing support:

function detectHtmlInXhr(callback) {
  if (!window.XMLHttpRequest) {
    window.setTimeout(function() { callback(false); }, 0);
    return;
  }
  var done = false;
  var xhr = new window.XMLHttpRequest();
  xhr.onreadystatechange = function() {
    if (this.readyState == 4 && !done) {
      done = true;
      callback(!!(this.responseXML && this.responseXML.title && this.responseXML.title == "&&<"));
    }
  }
  xhr.onabort = xhr.onerror = function() {
    if (!done) {
      done = true;
      callback(false);
    }
  }
  try {
    xhr.open("GET", "detect.html");
    xhr.responseType = "document";
    xhr.send();
  } catch (e) {
    window.setTimeout(function() {
      if (!done) {
        done = true;
        callback(false);
      }
    }, 0);
  }
}

The argument callback is a function that will be called asynchronously with true as the only argument if HTML parsing is supported and false as the only argument if HTML parsing is not supported.

View on JSFiddle

Character encoding

If the character encoding is declared in the HTTP Content-Type header, that character encoding is used. Failing that, if there is a byte order mark, the encoding indicated by the byte order mark is used. Failing that, if there is a <meta> element that declares the encoding within the first 1024 bytes of the file, that encoding is used. Otherwise, the file is decoded as UTF-8.

Handling HTML on older browsers

XMLHttpRequest originally supported only XML parsing. HTML parsing support is a recent addition. For older browsers, you can even use the XMLHttpRequest.responseText property in association with regular expressions in order to get, for example, the source code of an HTML element given its ID:

function getHTML (oXHR, sTargetId) {
  var  rOpen = new RegExp("<(?!\!)\\s*([^\\s>]+)[^>]*\\s+id\\=[\"\']" + sTargetId + "[\"\'][^>]*>" ,"i"),
       sSrc = oXHR.responseText, aExec = rOpen.exec(sSrc);

  return aExec ? (new RegExp("(?:(?:.(?!<\\s*" + aExec[1] + "[^>]*[>]))*.?<\\s*" + aExec[1] + "[^>]*[>](?:.(?!<\\s*\/\\s*" + aExec[1] + "\\s*>))*.?<\\s*\/\\s*" + aExec[1] + "\\s*>)*(?:.(?!<\\s*\/\\s*" + aExec[1] + "\\s*>))*.?", "i")).exec(sSrc.slice(sSrc.indexOf(aExec[0]) + aExec[0].length)) || "" : "";
}

var oReq = new XMLHttpRequest();
oReq.open("GET", "yourPage.html", true);
oReq.onload = function () { console.log(getHTML(this, "intro")); };
oReq.send(null);
Note: This solution is very expensive for the interpreter. Use it only when it is really necessary.

Specifications

SpecificationStatusComment
XMLHttpRequestLiving StandardInitial definition

Browser compatibility

XMLHttpRequest interface

BCD tables only load in the browser

See also

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。
列表为空,暂无数据

词条统计

浏览:95 次

字数:9456

最后编辑:7 年前

编辑次数:0 次

    我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
    原文