nsIParserUtils

parser/html/nsIParserUtils.idl (Scriptable)

Provides non-Web HTML parsing functionality to Firefox extensions and XULRunner applications.

Introduced in Gecko 13.0. Inherits from: nsISupports. Last changed in Gecko 14.0 (Firefox 14.0 / Thunderbird 14.0 / SeaMonkey 2.11).

Warning: Do not use this from within Gecko; use nsContentUtils, nsTreeSanitizer, and so on directly instead.

Implemented by: @mozilla.org/parserutils;1 as a service:

var parserUtils = Components.classes["@mozilla.org/parserutils;1"]
                  .getService(Components.interfaces.nsIParserUtils);

Method overview

AString convertToPlainText(in AString src, in unsigned long flags, in unsigned long wrapCol);
nsIDOMDocumentFragment parseFragment(in AString fragment, in unsigned long flags, in boolean isXML, in nsIURI baseURI, in nsIDOMElement element);
AString sanitize(in AString src, in unsigned long flags);

Constants

Each entry gives the constant, its value, and its description.

SanitizerAllowComments (1 << 0)
Flag for sanitizer: Allow comment nodes.

SanitizerAllowStyle (1 << 1)
Flag for sanitizer: Allow <style> elements and style attributes (with contents sanitized in case of -moz-binding).
Note: If -moz-binding is absent, properties that might be XSS risks in other Web engines are preserved!

SanitizerCidEmbedsOnly (1 << 2)
Flag for sanitizer: Only allow cid: URLs for embedded content.
At present, sanitizing CSS backgrounds and the like is not supported, so setting this together with SanitizerAllowStyle doesn't make sense.
At present, sanitizing CSS syntax in SVG presentational attributes is not supported, so this option flattens out SVG.

SanitizerDropNonCSSPresentation (1 << 3)
Flag for sanitizer: Drops non-CSS presentational HTML elements and attributes, such as <font>, <center>, and the bgcolor attribute.

SanitizerDropForms (1 << 4)
Flag for sanitizer: Drops forms and form controls (excluding <fieldset> and <legend>).

SanitizerDropMedia (1 << 5)
Flag for sanitizer: Drops <img>, <video>, <audio>, and <source>, and flattens out SVG.

SanitizerLogRemovals (1 << 6)
Flag for sanitizer: Log messages to the console for everything that gets sanitized.
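
The sanitizer flags are single bits, so several of them can be combined with bitwise OR before being passed as the flags argument. The following is a minimal sketch of that idea in plain JavaScript. The numeric values are copied from the Constants section above; the SanitizerFlags object and the combineFlags() and hasFlag() helpers are hypothetical names introduced only for illustration and are not part of nsIParserUtils.

```javascript
// Values copied from the Constants section of this page.
const SanitizerFlags = Object.freeze({
  SanitizerAllowComments: 1 << 0,
  SanitizerAllowStyle: 1 << 1,
  SanitizerCidEmbedsOnly: 1 << 2,
  SanitizerDropNonCSSPresentation: 1 << 3,
  SanitizerDropForms: 1 << 4,
  SanitizerDropMedia: 1 << 5,
  SanitizerLogRemovals: 1 << 6,
});

// Hypothetical helper: OR the named flags together into one bitmask.
function combineFlags(...names) {
  return names.reduce((mask, name) => {
    if (!Object.prototype.hasOwnProperty.call(SanitizerFlags, name)) {
      throw new RangeError("Unknown sanitizer flag: " + name);
    }
    return mask | SanitizerFlags[name];
  }, 0);
}

// Hypothetical helper: report whether a bitmask contains the named flag.
function hasFlag(mask, name) {
  return (mask & SanitizerFlags[name]) !== 0;
}

// Example: only cid: embeds, no forms, and log every removal.
const mailFlags = combineFlags(
  "SanitizerCidEmbedsOnly",
  "SanitizerDropForms",
  "SanitizerLogRemovals"
);
```

In privileged code the same constants would normally be read from the interface itself rather than redefined; they are restated here only so the sketch runs on its own.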

Methods

convertToPlainText()

Converts HTML to plain text.

AString convertToPlainText(
  in AString src,
  in unsigned long flags,
  in unsigned long wrapCol
);
Parameters
src
The HTML source to parse (C++ callers are allowed but not required to use the same string for the return value).
flags
Conversion option flags defined in nsIDocumentEncoder.
wrapCol
Number of characters per line; 0 for no auto-wrapping.
Return value

The plain text conversion of the HTML specified in src.
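
A call to convertToPlainText() might be wrapped as in the sketch below. The wrapper name htmlToPlainText() and its default arguments are assumptions made for illustration; the service object is passed in by the caller so that the function does not depend on how it was obtained. The commented usage assumes that nsIDocumentEncoder.OutputFormatted is a suitable flag for the caller's purpose.

```javascript
// Hypothetical wrapper around nsIParserUtils.convertToPlainText().
// `parserUtils` is the service obtained with getService(), as shown
// near the top of this page.
function htmlToPlainText(parserUtils, html, encoderFlags = 0, wrapCol = 0) {
  if (!Number.isInteger(wrapCol) || wrapCol < 0) {
    throw new RangeError("wrapCol must be a non-negative integer");
  }
  // wrapCol === 0 means no auto-wrapping, per the parameter description.
  // `>>> 0` coerces the flags to an unsigned 32-bit value.
  return parserUtils.convertToPlainText(String(html), encoderFlags >>> 0, wrapCol);
}

// Usage in privileged code (cannot run outside Gecko):
//   var flags = Components.interfaces.nsIDocumentEncoder.OutputFormatted;
//   var text = htmlToPlainText(parserUtils, "<p>Hello <b>world</b></p>", flags, 72);
```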

parseFragment()

Parses markup into a sanitized document fragment.

nsIDOMDocumentFragment parseFragment(
  in AString fragment,
  in unsigned long flags,
  in boolean isXML,
  in nsIURI baseURI,
  in nsIDOMElement element
);
Parameters
fragment
The input markup.
flags
Sanitization option flags defined above.
isXML
true if fragment is XML; false if it is HTML.
baseURI
The base URL for this fragment.
element
The context node for the fragment parsing algorithm.
Return value

An nsIDOMDocumentFragment object for the resulting sanitized document fragment.
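
As an illustration of how the parameters fit together, the sketch below parses untrusted HTML and appends the sanitized fragment to the context element. The function name insertUntrustedHtml() and the choice of flags are assumptions for this example only; the flag values are copied from the Constants section, and the service, base URI, and element are supplied by the caller.

```javascript
// Values copied from the Constants section of this page.
const SanitizerDropForms = 1 << 4;
const SanitizerDropMedia = 1 << 5;

// Hypothetical helper: parse untrusted HTML into a sanitized fragment
// and append it to `contextElement`.
function insertUntrustedHtml(parserUtils, markup, baseURI, contextElement) {
  const flags = SanitizerDropForms | SanitizerDropMedia;
  const fragment = parserUtils.parseFragment(
    markup,
    flags,
    false,          // isXML: the input is HTML
    baseURI,        // base URL for the fragment
    contextElement  // context node for the fragment parsing algorithm
  );
  contextElement.appendChild(fragment);
  return fragment;
}
```

Using the element that will receive the content as the context node is a design choice of this sketch; other callers may prefer a different context node.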

sanitize()

Parses a string into an HTML document, sanitizes the document, and returns the result serialized to a string.

The sanitizer is designed to protect against XSS when sanitized content is inserted into a different-origin context without an iframe-equivalent sandboxing mechanism.

By default, the sanitizer does not try to prevent third parties from learning that the content was viewed. For example, an <img> whose source points to an HTTP server potentially controlled by a third party is not removed. To avoid ambient information leakage upon loading the sanitized content, use the SanitizerCidEmbedsOnly flag. Even then, <a> links (and similar) to other content are preserved, so an explicit user action (following a link) after the content has been loaded can still leak information.

By default, neither forms nor non-dangerous, non-CSS presentational HTML elements and attributes are removed. To remove these, use SanitizerDropNonCSSPresentation and/or SanitizerDropForms.

By default, comments and CSS are removed. To preserve comments, use SanitizerAllowComments. To preserve <style> elements and style attributes on other elements, use SanitizerAllowStyle. If -moz-binding is present in <style> elements or style attributes, it is removed; when that happens, properties that Gecko doesn't recognize can also get removed as a side effect.

Note: If -moz-binding is not present in <style> elements or style attributes and SanitizerAllowStyle is specified, the sanitized content may still be XSS-dangerous if loaded into a non-Gecko Web engine!

AString sanitize(
  in AString src,
  in unsigned long flags
);
Parameters
src
The HTML source to parse (C++ callers are allowed but not required to use the same string for the return value).
flags
Sanitization option flags defined above.
Return value

The sanitized document, serialized to a string.
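
The defaults described above can be made explicit by building the flags argument from named options. The sketch below is one way to do that, under the assumption that the caller passes in the service object; sanitizeForDisplay() and its option names are hypothetical, and the flag values are copied from the Constants section. It rejects the combination of SanitizerCidEmbedsOnly and SanitizerAllowStyle, which this page describes as not making sense.

```javascript
// Values copied from the Constants section of this page.
const SanitizerAllowComments = 1 << 0;
const SanitizerAllowStyle = 1 << 1;
const SanitizerCidEmbedsOnly = 1 << 2;
const SanitizerDropNonCSSPresentation = 1 << 3;
const SanitizerDropForms = 1 << 4;

// Hypothetical helper: translate named options into sanitizer flags,
// then call nsIParserUtils.sanitize() on the supplied service.
function sanitizeForDisplay(parserUtils, src, options = {}) {
  let flags = 0;
  if (options.keepComments) flags |= SanitizerAllowComments;
  if (options.keepStyle) flags |= SanitizerAllowStyle;
  if (options.cidEmbedsOnly) flags |= SanitizerCidEmbedsOnly;
  if (options.dropPresentation) flags |= SanitizerDropNonCSSPresentation;
  if (options.dropForms) flags |= SanitizerDropForms;

  // CSS backgrounds are not sanitized, so the two flags conflict.
  if ((flags & SanitizerCidEmbedsOnly) && (flags & SanitizerAllowStyle)) {
    throw new Error("cidEmbedsOnly cannot be combined with keepStyle");
  }
  return parserUtils.sanitize(src, flags);
}

// Usage in privileged code (cannot run outside Gecko):
//   var clean = sanitizeForDisplay(parserUtils, untrustedHtml,
//                                  { cidEmbedsOnly: true, dropForms: true });
```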
