nsIParserUtils

Provides non-Web HTML parser functionality to Firefox extensions and XULRunner apps.
Don’t use this from within Gecko; use nsContentUtils, nsTreeSanitizer, etc.
directly instead.

Methods

sanitize(src, flags)

Parses a string into an HTML document, sanitizes the document and
returns the result serialized to a string.

The sanitizer is designed to protect against XSS when sanitized content
is inserted into a different-origin context without an iframe-equivalent
sandboxing mechanism.

By default, the sanitizer doesn’t try to avoid leaking to third parties the
information that the content was viewed. That is, by default, e.g. an
<img src> pointing to an HTTP server potentially controlled by a third
party is not removed. To avoid ambient information leakage upon loading
the sanitized content, use the SanitizerCidEmbedsOnly flag. Even in that
case, <a href> links (and similar) to other content are preserved, so an
explicit user action (following a link) after the content has been loaded
can still leak information.

By default, non-dangerous non-CSS presentational HTML elements and
attributes or forms are not removed. To remove these, use
SanitizerDropNonCSSPresentation and/or SanitizerDropForms.

By default, comments and CSS are removed. To preserve comments, use
SanitizerAllowComments. To preserve <style> and style="", use
SanitizerAllowStyle.

Parameters

src the HTML source to parse (C++ callers are allowed but not required to use the same string for the return value.)
flags sanitization option flags defined under Constants
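
As a rough illustration of how the flags combine, the sketch below builds a flag mask and calls sanitize(). The helper name sanitizeUntrusted and its options object are hypothetical, not part of the interface. The parserUtils argument is assumed to be an nsIParserUtils instance that exposes the Sanitizer* constants, as privileged extension code would typically obtain from the "@mozilla.org/parserutils;1" service.

```javascript
// Hypothetical helper: sanitize untrusted HTML with an explicit flag mask.
// `parserUtils` is assumed to be an nsIParserUtils instance, e.g. from
//   Components.classes["@mozilla.org/parserutils;1"]
//             .getService(Components.interfaces.nsIParserUtils);
function sanitizeUntrusted(parserUtils, html, options) {
  options = options || {};
  // With no flags set, comments and CSS are removed, while forms and
  // non-CSS presentational markup are kept.
  let flags = 0;
  if (options.keepComments) {
    flags |= parserUtils.SanitizerAllowComments;
  }
  if (options.dropForms) {
    flags |= parserUtils.SanitizerDropForms;
  }
  if (options.dropPresentation) {
    flags |= parserUtils.SanitizerDropNonCSSPresentation;
  }
  // Returns the sanitized document serialized to a string.
  return parserUtils.sanitize(html, flags);
}
```

A caller might use sanitizeUntrusted(parserUtils, feedItemHtml, { dropForms: true }) before displaying third-party markup; the flag values are read from the instance and never hard-coded.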

convertToPlainText(src, flags, wrapCol)

Converts HTML to plain text.

Parameters

src the HTML source to parse (C++ callers are allowed but not required to use the same string for the return value.)
flags conversion option flags defined in nsIDocumentEncoder
wrapCol number of characters per line; 0 for no auto-wrapping
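
A minimal sketch of a call follows. The wrapper name htmlToPlainText is hypothetical, and parserUtils is again assumed to be an nsIParserUtils instance. The encoder flags would normally be taken from nsIDocumentEncoder; the sketch passes through whatever the caller supplies and falls back to 0 (no options) as an assumption.

```javascript
// Hypothetical wrapper around convertToPlainText().
// `encoderFlags` is expected to be a combination of nsIDocumentEncoder
// output flags; 0 is used here as an assumed "no options" default.
function htmlToPlainText(parserUtils, html, encoderFlags, wrapCol) {
  const flags = encoderFlags === undefined ? 0 : encoderFlags;
  // A wrap column of 0 disables auto-wrapping.
  const columns = wrapCol === undefined ? 0 : wrapCol;
  return parserUtils.convertToPlainText(html, flags, columns);
}
```

For example, htmlToPlainText(parserUtils, html, someEncoderFlags, 72) would request lines wrapped at 72 characters.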

parseFragment(fragment, flags, isXML, baseURI, element)

Parses markup into a sanitized document fragment.

Parameters

fragment the input markup
flags sanitization option flags defined under Constants
isXML true if |fragment| is XML and false if HTML
baseURI the base URL for this fragment
element the context node for the fragment parsing algorithm
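
The following sketch shows one way the parameters might fit together when inserting untrusted HTML into an existing element. The function name appendSanitizedHTML is hypothetical; parserUtils is assumed to be an nsIParserUtils instance, target a DOM element used both as the context node and as the insertion point, and baseURI a URI object supplied by the caller. The choice of SanitizerDropForms is only an example.

```javascript
// Hypothetical helper: parse untrusted HTML into a sanitized fragment and
// append it to `target`, which also serves as the parsing context node.
function appendSanitizedHTML(parserUtils, target, markup, baseURI) {
  const flags = parserUtils.SanitizerDropForms; // example flag choice
  const isXML = false;                          // input is HTML, not XML
  const fragment = parserUtils.parseFragment(
    markup, flags, isXML, baseURI, target);
  target.appendChild(fragment);
  return fragment;
}
```

Using the target element as the context node means the markup is parsed as if it appeared inside that element, which matters for context-sensitive content such as table rows.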

Constants

SanitizerAllowComments

Flag for sanitizer: Allow comment nodes.

SanitizerAllowStyle

Flag for sanitizer: Allow <style> and style="".

SanitizerCidEmbedsOnly

Flag for sanitizer: Only allow cid: URLs for embedded content.

At present, sanitizing CSS backgrounds, etc., is not supported, so setting
this together with SanitizerAllowStyle doesn’t make sense.

At present, sanitizing CSS syntax in SVG presentational attributes is not
supported, so this option flattens out SVG.
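
Since combining this flag with SanitizerAllowStyle doesn’t make sense, a caller might guard against that combination before sanitizing. The check below is a sketch only; the function name hasConflictingEmbedFlags is hypothetical, and parserUtils is assumed to be an nsIParserUtils instance exposing the constants.

```javascript
// Hypothetical check: true when a flag mask sets both SanitizerCidEmbedsOnly
// and SanitizerAllowStyle, a combination the documentation says does not
// make sense because CSS backgrounds etc. are not sanitized.
function hasConflictingEmbedFlags(parserUtils, flags) {
  const both =
    parserUtils.SanitizerCidEmbedsOnly | parserUtils.SanitizerAllowStyle;
  return (flags & both) === both;
}
```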

SanitizerDropNonCSSPresentation

Flag for sanitizer: Drop non-CSS presentational HTML elements and
attributes, such as <font>, <center> and bgcolor="".

SanitizerDropForms

Flag for sanitizer: Drop forms and form controls (excluding
fieldset/legend).

SanitizerDropMedia

Flag for sanitizer: Drop <img>, <video>, <audio> and <source>, and flatten out SVG.