nsIParserUtils
parser/html/nsIParserUtils.idl
Scriptable
Provides non-Web HTML parsing functionality to Firefox extensions and XULRunner applications.
Introduced in Gecko 13.0. Inherits from: nsISupports
Last changed in Gecko 14.0 (Firefox 14.0 / Thunderbird 14.0 / SeaMonkey 2.11)
Warning: Do not use this from within Gecko; use nsContentUtils, nsTreeSanitizer, and so on directly instead.
Implemented by: @mozilla.org/parserutils;1 as a service:
var parserUtils = Components.classes["@mozilla.org/parserutils;1"]
                            .getService(Components.interfaces.nsIParserUtils);
Method overview
AString convertToPlainText(in AString src, in unsigned long flags, in unsigned long wrapCol);
nsIDOMDocumentFragment parseFragment(in AString fragment, in unsigned long flags, in boolean isXML, in nsIURI baseURI, in nsIDOMElement element);
AString sanitize(in AString src, in unsigned long flags);
Constants
Constant | Value | Description |
SanitizerAllowComments | (1 << 0) | Flag for sanitizer: Allow comment nodes. |
SanitizerAllowStyle | (1 << 1) | Flag for sanitizer: Allow <style> elements and style attributes (with contents sanitized in case of -moz-binding). Note: If -moz-binding is absent, properties that might be XSS risks in other Web engines are preserved! |
SanitizerCidEmbedsOnly | (1 << 2) | Flag for sanitizer: Only allow cid: URLs for embedded content. At present, sanitizing CSS backgrounds, and so on, is not supported, so setting this together with SanitizerAllowStyle doesn't make sense. |
SanitizerDropNonCSSPresentation | (1 << 3) | Flag for sanitizer: Drops non-CSS presentational HTML elements and attributes, such as <font>, <center>, and the bgcolor attribute. |
SanitizerDropForms | (1 << 4) | Flag for sanitizer: Drops forms and form controls (excluding <fieldset> and <legend>). |
SanitizerDropMedia | (1 << 5) | Flag for sanitizer: Drops <img>, <video>, <audio>, and <source>, and flattens out SVG. |
SanitizerLogRemovals | (1 << 6) | Flag for sanitizer: Log messages to the console for everything that gets sanitized. |
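Because each flag occupies its own bit, several options can be combined with bitwise OR before being passed to sanitize() or parseFragment(). The following is a minimal sketch, not part of the interface: combineSanitizerFlags is a hypothetical helper, and the SanitizerFlags object is a local copy of the values in the table above (privileged code would normally read the constants from Components.interfaces.nsIParserUtils instead).

```javascript
// Local mirror of the constants in the table above (illustrative only;
// privileged code would read them from Components.interfaces.nsIParserUtils).
var SanitizerFlags = {
  SanitizerAllowComments: 1 << 0,
  SanitizerAllowStyle: 1 << 1,
  SanitizerCidEmbedsOnly: 1 << 2,
  SanitizerDropNonCSSPresentation: 1 << 3,
  SanitizerDropForms: 1 << 4,
  SanitizerDropMedia: 1 << 5,
  SanitizerLogRemovals: 1 << 6
};

// Hypothetical helper: OR together the flags named in `names`.
function combineSanitizerFlags(names) {
  return names.reduce(function (flags, name) {
    if (!Object.prototype.hasOwnProperty.call(SanitizerFlags, name)) {
      throw new Error("Unknown sanitizer flag: " + name);
    }
    return flags | SanitizerFlags[name];
  }, 0);
}

var flags = combineSanitizerFlags(["SanitizerDropForms", "SanitizerDropMedia"]);
// flags is 48, that is (1 << 4) | (1 << 5)
```

Passing 0 (no names) leaves every option at its default.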
Methods
convertToPlainText()
Converts HTML to plain text.
AString convertToPlainText( in AString src, in unsigned long flags, in unsigned long wrapCol );
Parameters
src
- The HTML source to parse. (C++ callers are allowed but not required to use the same string for the return value.)
flags
- Conversion option flags defined in nsIDocumentEncoder.
wrapCol
- Number of characters per line; 0 for no auto-wrapping.
Return value
The plain text conversion of the HTML specified in src.
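A call to convertToPlainText() might be wrapped as in the sketch below. The function name htmlToPlainText and its defaulting behavior are assumptions made for illustration; parserUtils is expected to be the service obtained as shown at the top of this page, and encoderFlags a bitwise OR of nsIDocumentEncoder output flags.

```javascript
// Hypothetical wrapper around convertToPlainText(). `parserUtils` is the
// nsIParserUtils service; `encoderFlags` is a bitwise OR of
// nsIDocumentEncoder output flags.
function htmlToPlainText(parserUtils, html, encoderFlags, wrapCol) {
  // Fall back to no flags and no auto-wrapping when the caller omits them.
  var flags = encoderFlags === undefined ? 0 : encoderFlags;
  var col = wrapCol === undefined ? 0 : wrapCol;
  return parserUtils.convertToPlainText(html, flags, col);
}
```

In privileged code this could be called as htmlToPlainText(parserUtils, "<p>Hello</p>", Components.interfaces.nsIDocumentEncoder.OutputFormatted, 72) to request formatted output wrapped at 72 columns.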
parseFragment()
Parses markup into a sanitized document fragment.
nsIDOMDocumentFragment parseFragment( in AString fragment, in unsigned long flags, in boolean isXML, in nsIURI baseURI, in nsIDOMElement element );
Parameters
fragment
- The input markup.
flags
- Sanitization option flags defined above.
isXML
- true if fragment is XML and false if HTML.
baseURI
- The base URL for this fragment.
element
- The context node for the fragment parsing algorithm.
Return value
An nsIDOMDocumentFragment object for the resulting sanitized document fragment.
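As an illustration of how the five parameters fit together, the sketch below parses untrusted HTML and appends the sanitized result to an existing element, which also serves as the context node. The helper name appendSanitizedHtml is hypothetical; parserUtils is assumed to be the nsIParserUtils service and baseURI an nsIURI supplied by the caller.

```javascript
// Hypothetical helper: parse untrusted HTML into a sanitized fragment and
// append it to `container`, which doubles as the context node.
function appendSanitizedHtml(parserUtils, container, markup, baseURI, flags) {
  var fragment = parserUtils.parseFragment(
    markup,
    flags || 0,  // sanitization flags; 0 leaves every option at its default
    false,       // isXML: the input is HTML, not XML
    baseURI,     // base URL for the fragment
    container    // context node for the fragment parsing algorithm
  );
  container.appendChild(fragment);
  return fragment;
}
```

Using the destination element as the context node means the markup is parsed as if it had appeared inside that element.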
sanitize()
Parses a string into an HTML document, sanitizes the document, and returns the result serialized to a string.
The sanitizer is designed to protect against XSS when sanitized content is inserted into a different-origin context without an iframe-equivalent sandboxing mechanism.
By default, the sanitizer doesn't try to avoid leaking to third parties the information that the content was viewed. For example, an <img> with a source pointing to an HTTP server potentially controlled by a third party is not removed by default. To avoid ambient information leakage upon loading the sanitized content, use the SanitizerCidEmbedsOnly flag. In that case, <a> links (and similar) to other content are preserved, so an explicit user action (following a link) after the content has been loaded can still leak information.
By default, non-dangerous non-CSS presentational HTML elements and attributes, as well as forms, are not removed. To remove these, use SanitizerDropNonCSSPresentation and/or SanitizerDropForms.
By default, comments and CSS are removed. To preserve comments, use SanitizerAllowComments. To preserve <style> elements and style attributes on other elements, use SanitizerAllowStyle. -moz-binding is removed from <style> elements and style attributes if present. In this case, properties that Gecko doesn't recognize can get removed as a side effect.
Warning: If -moz-binding is not present in <style> elements or style attributes and SanitizerAllowStyle is specified, the sanitized content may still be XSS dangerous if loaded into a non-Gecko Web engine!
AString sanitize( in AString src, in unsigned long flags );
Parameters
src
- The HTML source to parse (C++ callers are allowed but not required to use the same string for the return value).
flags
- Sanitization option flags defined above.
Return value
The resulting text.
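The sketch below shows one way sanitize() might be called on untrusted markup, restricting embeds to cid: URLs and dropping forms. The function name sanitizeUntrustedHtml is hypothetical, and the choice of flags is only an example; parserUtils is assumed to be the nsIParserUtils service and Ci a reference to Components.interfaces.

```javascript
// Hypothetical helper: sanitize untrusted HTML, allowing only cid: embeds
// and dropping forms. `Ci` stands for Components.interfaces.
function sanitizeUntrustedHtml(parserUtils, Ci, html) {
  var iface = Ci.nsIParserUtils;
  var flags = iface.SanitizerCidEmbedsOnly | iface.SanitizerDropForms;
  return parserUtils.sanitize(html, flags);
}
```

In privileged code the call would look like sanitizeUntrustedHtml(parserUtils, Components.interfaces, untrustedMarkup), where untrustedMarkup is the string to clean.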