
plone3  3.1.7
plone.app.portlets.portlets.feedparser._HTMLSanitizer Class Reference


Public Member Functions

def reset
def unknown_starttag
def unknown_endtag
def handle_pi
def handle_decl
def handle_data
def feed
def normalize_attrs
def handle_charref
def handle_entityref
def handle_comment
def output

Public Attributes

 unacceptablestack
 encoding
 pieces

Static Public Attributes

list acceptable_elements
list acceptable_attributes
list unacceptable_elements_with_end_tag = ['script', 'applet']
list elements_no_end_tag

Detailed Description

Definition at line 1597 of file feedparser.py.


Member Function Documentation

Definition at line 1433 of file feedparser.py.

    def feed(self, data):
        data = re.compile(r'<!((?!DOCTYPE|--|\[))', re.IGNORECASE).sub(r'&lt;!\1', data)
        #data = re.sub(r'<(\S+?)\s*?/>', self._shorttag_replace, data) # bug [ 1399464 ] Bad regexp for _shorttag_replace
        data = re.sub(r'<([^<\s]+?)\s*/>', self._shorttag_replace, data)
        data = data.replace('&#39;', "'")
        data = data.replace('&#34;', '"')
        if self.encoding and type(data) == type(u''):
            data = data.encode(self.encoding)
        sgmllib.SGMLParser.feed(self, data)
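
The text rewrites that feed() applies before handing data to the SGML parser can be tried in isolation. The following is a minimal Python 3 sketch, not the library's code: the names `preprocess`, `BOGUS_DECL` and `SHORT_TAG` are hypothetical, and because the `_shorttag_replace` helper is not shown on this page, `shorttag_replace` below is an assumed stand-in that expands short tags using the `elements_no_end_tag` list. The encoding step and the parser call are left out.

```python
import re

# Copied from the elements_no_end_tag list documented on this page.
ELEMENTS_NO_END_TAG = ['area', 'base', 'basefont', 'br', 'col', 'frame', 'hr',
                       'img', 'input', 'isindex', 'link', 'meta', 'param']

# Same pattern feed() uses: "<!" not followed by DOCTYPE, "--" or "[".
BOGUS_DECL = re.compile(r'<!((?!DOCTYPE|--|\[))', re.IGNORECASE)

# Same pattern feed() uses to find attribute-less short tags such as "<br/>".
SHORT_TAG = re.compile(r'<([^<\s]+?)\s*/>')


def shorttag_replace(match):
    """Assumed stand-in for _shorttag_replace (not shown on this page)."""
    tag = match.group(1)
    if tag in ELEMENTS_NO_END_TAG:
        return '<%s />' % tag
    return '<%s></%s>' % (tag, tag)


def preprocess(data):
    """Apply the four text rewrites from feed(), in the same order."""
    data = BOGUS_DECL.sub(r'&lt;!\1', data)
    data = SHORT_TAG.sub(shorttag_replace, data)
    data = data.replace('&#39;', "'")
    data = data.replace('&#34;', '"')
    return data
```

Under these assumptions, `preprocess('<!bogus>')` yields `&lt;!bogus>`, while `<!DOCTYPE html>`, comments and marked sections pass through unchanged because the negative lookahead excludes them.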


Definition at line 1472 of file feedparser.py.

    def handle_charref(self, ref):
        # called for each character reference, e.g. for '&#160;', ref will be '160'
        # Reconstruct the original character reference.
        self.pieces.append('&#%(ref)s;' % locals())

Definition at line 1489 of file feedparser.py.

    def handle_comment(self, text):
        # called for each HTML comment, e.g. <!-- insert Javascript code here -->
        # Reconstruct the original comment.
        self.pieces.append('<!--%(text)s-->' % locals())

Reimplemented from plone.app.portlets.portlets.feedparser._BaseHTMLProcessor.

Definition at line 1646 of file feedparser.py.

    def handle_data(self, text):
        if not self.unacceptablestack:
            _BaseHTMLProcessor.handle_data(self, text)


Reimplemented from plone.app.portlets.portlets.feedparser._BaseHTMLProcessor.

Definition at line 1643 of file feedparser.py.

    def handle_decl(self, text):
        pass

Definition at line 1477 of file feedparser.py.

    def handle_entityref(self, ref):
        # called for each entity reference, e.g. for '&copy;', ref will be 'copy'
        # Reconstruct the original entity reference.
        self.pieces.append('&%(ref)s;' % locals())
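
The two reference handlers, handle_charref and handle_entityref, append each reference back in its original textual form instead of decoding it. A small Python 3 sketch of that round trip follows; `PieceCollector` is a hypothetical stand-in that mimics only these two handlers, not the real processor class.

```python
class PieceCollector:
    """Hypothetical stand-in that mimics only the two reference handlers."""

    def __init__(self):
        self.pieces = []

    def handle_charref(self, ref):
        # For '&#160;' the parser passes ref == '160'.
        self.pieces.append('&#%s;' % ref)

    def handle_entityref(self, ref):
        # For '&copy;' the parser passes ref == 'copy'.
        self.pieces.append('&%s;' % ref)


collector = PieceCollector()
collector.handle_charref('160')
collector.handle_entityref('copy')
result = ''.join(collector.pieces)
```

Here `result` is the string `&#160;&copy;`, so the references reach the output exactly as they appeared in the input.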

Reimplemented from plone.app.portlets.portlets.feedparser._BaseHTMLProcessor.

Definition at line 1640 of file feedparser.py.

    def handle_pi(self, text):
        pass

Definition at line 1443 of file feedparser.py.

    def normalize_attrs(self, attrs):
        # utility method to be called by descendants
        attrs = [(k.lower(), v) for k, v in attrs]
        attrs = [(k, k in ('rel', 'type') and v.lower() or v) for k, v in attrs]
        return attrs
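
The normalization is easier to follow on a concrete input. The sketch below is a standalone Python 3 rewrite for illustration only: it replaces the `and`/`or` idiom with a conditional expression, which gives the same result for string values, and the sample attribute list is made up.

```python
def normalize_attrs(attrs):
    """Lower-case every attribute name, and the values of 'rel' and 'type'."""
    attrs = [(k.lower(), v) for k, v in attrs]
    return [(k, v.lower() if k in ('rel', 'type') else v) for k, v in attrs]


normalized = normalize_attrs([('REL', 'NoFollow'),
                              ('HREF', 'http://Example.com/A')])
```

The name `REL` and its value are both lower-cased, whereas the `href` value keeps its original case because only `rel` and `type` values are folded.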


Return processed HTML as a single string

Definition at line 1524 of file feedparser.py.

    def output(self):
        '''Return processed HTML as a single string'''
        return ''.join([str(p) for p in self.pieces])


Reimplemented from plone.app.portlets.portlets.feedparser._BaseHTMLProcessor.

Definition at line 1620 of file feedparser.py.

    def reset(self):
        _BaseHTMLProcessor.reset(self)
        self.unacceptablestack = 0


Reimplemented from plone.app.portlets.portlets.feedparser._BaseHTMLProcessor.

Definition at line 1633 of file feedparser.py.

    def unknown_endtag(self, tag):
        if not tag in self.acceptable_elements:
            if tag in self.unacceptable_elements_with_end_tag:
                self.unacceptablestack -= 1
            return
        _BaseHTMLProcessor.unknown_endtag(self, tag)

Reimplemented from plone.app.portlets.portlets.feedparser._BaseHTMLProcessor.

Definition at line 1624 of file feedparser.py.

    def unknown_starttag(self, tag, attrs):
        if not tag in self.acceptable_elements:
            if tag in self.unacceptable_elements_with_end_tag:
                self.unacceptablestack += 1
            return
        attrs = self.normalize_attrs(attrs)
        attrs = [(key, value) for key, value in attrs if key in self.acceptable_attributes]
        _BaseHTMLProcessor.unknown_starttag(self, tag, attrs)
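
Taken together, the start-tag, end-tag and data handlers implement a whitelist filter with a depth counter: tags outside the whitelist are dropped, and text is suppressed while the counter for `script`/`applet` is non-zero. The sketch below reproduces that pattern in Python 3. It is an illustration under stated assumptions, not the class documented here: it builds on the standard library's `html.parser.HTMLParser` rather than `sgmllib`, the class and function names are hypothetical, and the whitelists are shortened for readability.

```python
from html import escape
from html.parser import HTMLParser


class MiniSanitizer(HTMLParser):
    """Hypothetical whitelist sanitizer modelled on the handlers above."""

    # Deliberately tiny whitelists; the real lists are documented on this page.
    ACCEPTABLE_ELEMENTS = {'a', 'b', 'em', 'p'}
    ACCEPTABLE_ATTRIBUTES = {'href', 'title'}
    UNACCEPTABLE_WITH_END_TAG = {'script', 'applet'}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.pieces = []
        self.unacceptablestack = 0  # > 0 while inside script/applet

    def handle_starttag(self, tag, attrs):
        if tag not in self.ACCEPTABLE_ELEMENTS:
            if tag in self.UNACCEPTABLE_WITH_END_TAG:
                self.unacceptablestack += 1
            return  # drop the tag itself
        kept = [(k, v) for k, v in attrs if k in self.ACCEPTABLE_ATTRIBUTES]
        attr_text = ''.join(' %s="%s"' % (k, escape(v or '', quote=True))
                            for k, v in kept)
        self.pieces.append('<%s%s>' % (tag, attr_text))

    def handle_endtag(self, tag):
        if tag not in self.ACCEPTABLE_ELEMENTS:
            if tag in self.UNACCEPTABLE_WITH_END_TAG:
                self.unacceptablestack -= 1
            return
        self.pieces.append('</%s>' % tag)

    def handle_data(self, text):
        # Text survives unless it sits inside a script/applet element.
        if not self.unacceptablestack:
            self.pieces.append(escape(text, quote=False))

    def output(self):
        return ''.join(self.pieces)


def sanitize(html_text):
    parser = MiniSanitizer()
    parser.feed(html_text)
    parser.close()  # flush any buffered trailing text
    return parser.output()
```

With these shortened lists, `sanitize('<p onclick="x()">Hi <script>alert(1)</script><b>there</b></p>')` returns `<p>Hi <b>there</b></p>`: the `onclick` attribute and the whole `script` element disappear. A tag that is merely absent from the whitelist, such as `span`, loses its markup but keeps its text.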
        



Member Data Documentation

list acceptable_attributes

Initial value:
['abbr', 'accept', 'accept-charset', 'accesskey',
      'action', 'align', 'alt', 'axis', 'border', 'cellpadding', 'cellspacing',
      'char', 'charoff', 'charset', 'checked', 'cite', 'class', 'clear', 'cols',
      'colspan', 'color', 'compact', 'coords', 'datetime', 'dir', 'disabled',
      'enctype', 'for', 'frame', 'headers', 'height', 'href', 'hreflang', 'hspace',
      'id', 'ismap', 'label', 'lang', 'longdesc', 'maxlength', 'media', 'method',
      'multiple', 'name', 'nohref', 'noshade', 'nowrap', 'prompt', 'readonly',
      'rel', 'rev', 'rows', 'rowspan', 'rules', 'scope', 'selected', 'shape', 'size',
      'span', 'src', 'start', 'summary', 'tabindex', 'target', 'title', 'type',
      'usemap', 'valign', 'value', 'vspace', 'width']

Definition at line 1607 of file feedparser.py.

list acceptable_elements

Initial value:
['a', 'abbr', 'acronym', 'address', 'area', 'b', 'big',
      'blockquote', 'br', 'button', 'caption', 'center', 'cite', 'code', 'col',
      'colgroup', 'dd', 'del', 'dfn', 'dir', 'div', 'dl', 'dt', 'em', 'fieldset',
      'font', 'form', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'hr', 'i', 'img', 'input',
      'ins', 'kbd', 'label', 'legend', 'li', 'map', 'menu', 'ol', 'optgroup',
      'option', 'p', 'pre', 'q', 's', 'samp', 'select', 'small', 'span', 'strike',
      'strong', 'sub', 'sup', 'table', 'tbody', 'td', 'textarea', 'tfoot', 'th',
      'thead', 'tr', 'tt', 'u', 'ul', 'var']

Definition at line 1598 of file feedparser.py.

list elements_no_end_tag

Initial value:
['area', 'base', 'basefont', 'br', 'col', 'frame', 'hr',
      'img', 'input', 'isindex', 'link', 'meta', 'param']

Definition at line 1414 of file feedparser.py.

encoding

Definition at line 1418 of file feedparser.py.

pieces

Definition at line 1423 of file feedparser.py.

list unacceptable_elements_with_end_tag

Definition at line 1618 of file feedparser.py.

unacceptablestack

Definition at line 1622 of file feedparser.py.


The documentation for this class was generated from the following file:

feedparser.py