plone3  3.1.7
kss.core.BeautifulSoup.BeautifulSOAP Class Reference

Public Member Functions

def popTag
def __getattr__
def isSelfClosingTag
def reset
def pushTag
def endData
def unknown_starttag
def unknown_endtag
def handle_data
def handle_pi
def handle_comment
def handle_charref
def handle_entityref
def handle_decl
def parse_declaration
def setup
def replaceWith
def extract
def insert
def findNext
def findAllNext
def findNextSibling
def findNextSiblings
def findPrevious
def findAllPrevious
def findPreviousSibling
def findPreviousSiblings
def findParent
def findParents
def nextGenerator
def nextSiblingGenerator
def previousGenerator
def previousSiblingGenerator
def parentGenerator
def substituteEncoding
def toEncoding

Public Attributes

 HTML_ENTITIES
 XML_ENTITIES
 parseOnlyThese
 fromEncoding
 smartQuotesTo
 convertHTMLEntities
 convertXMLEntities
 instanceSelfClosingTags
 markup
 markupMassage
 originalEncoding
 hidden
 currentData
 currentTag
 tagStack
 quoteStack
 previous
 literal
 parent
 next
 previousSibling
 nextSibling

Static Public Attributes

dictionary SELF_CLOSING_TAGS = {}
dictionary NESTABLE_TAGS = {}
dictionary RESET_NESTING_TAGS = {}
dictionary QUOTE_TAGS = {}
list MARKUP_MASSAGE
string ROOT_TAG_NAME = u'[document]'
string HTML_ENTITIES = "html"
string XML_ENTITIES = "xml"
list ALL_ENTITIES = [HTML_ENTITIES, XML_ENTITIES]
 fetchNextSiblings = findNextSiblings
 fetchPrevious = findAllPrevious
 fetchPreviousSiblings = findPreviousSiblings
 fetchParents = findParents

Detailed Description

This class will push a tag with only a single string child into
the tag's parent as an attribute. The attribute's name is the tag
name, and the value is the string child. An example should give
the flavor of the change:

<foo><bar>baz</bar></foo>
 =>
<foo bar="baz"><bar>baz</bar></foo>

You can then access fooTag['bar'] instead of fooTag.barTag.string.

This is, of course, useful for scraping structures that tend to
use subelements instead of attributes, such as SOAP messages. Note
that it modifies its input, so don't print the modified version
out.

I'm not sure how many people really want to use this class; let me
know if you do. Mainly I like the name.
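
As a rough illustration of the transformation described above, the sketch below applies the same rule to a tree built with Python's standard xml.etree.ElementTree rather than with this class. The helper name promote_single_string_children is hypothetical and is not part of BeautifulSoup; it only mirrors the rule that a child holding a single string is copied into its parent's attributes unless the parent already has an attribute of that name.

```python
import xml.etree.ElementTree as ET


def promote_single_string_children(element):
    """Copy each text-only child into its parent's attributes.

    Hypothetical helper: it imitates the BeautifulSOAP rule on an
    ElementTree element instead of a BeautifulSoup Tag.
    """
    for child in element:
        # Handle grandchildren first so the rule applies at every depth.
        promote_single_string_children(child)
        text_only = len(child) == 0 and child.text is not None
        # Like popTag, never overwrite an attribute the parent already has.
        if text_only and child.tag not in element.attrib:
            element.set(child.tag, child.text)
    return element


root = ET.fromstring("<foo><bar>baz</bar></foo>")
promote_single_string_children(root)

nested = ET.fromstring("<a><b><c>x</c></b></a>")
promote_single_string_children(nested)
```

After the call, root.get("bar") plays the role of fooTag['bar'], and the original bar child is still present in the tree.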

Definition at line 1474 of file BeautifulSoup.py.


Member Function Documentation

def kss.core.BeautifulSoup.BeautifulStoneSoup.__getattr__ (   self,
  methodName 
) [inherited]
This method routes method call requests to either the SGMLParser
superclass or the Tag superclass, depending on the method name.

Definition at line 1005 of file BeautifulSoup.py.

01005 
01006     def __getattr__(self, methodName):
01007         """This method routes method call requests to either the SGMLParser
01008         superclass or the Tag superclass, depending on the method name."""
01009         #print "__getattr__ called on %s.%s" % (self.__class__, methodName)
01010 
01011         if methodName.find('start_') == 0 or methodName.find('end_') == 0 \
01012                or methodName.find('do_') == 0:
01013             return SGMLParser.__getattr__(self, methodName)
01014         elif methodName.find('__') != 0:
01015             return Tag.__getattr__(self, methodName)
01016         else:
01017             raise AttributeError
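
The routing rule can be sketched in isolation as follows. The function route_method_name and the string labels it returns are illustrative assumptions; the real method delegates to SGMLParser.__getattr__ or Tag.__getattr__ instead of returning a label.

```python
def route_method_name(method_name):
    """Return the name of the superclass that would handle the lookup."""
    # Parser callbacks such as start_a, end_a and do_br go to SGMLParser.
    if method_name.startswith(("start_", "end_", "do_")):
        return "SGMLParser"
    # Anything that does not start with a double underscore goes to Tag.
    if not method_name.startswith("__"):
        return "Tag"
    # Dunder lookups are refused, as in the listing above.
    raise AttributeError(method_name)
```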


def kss.core.BeautifulSoup.BeautifulStoneSoup.endData (   self,
  containerClass = NavigableString 
) [inherited]

Definition at line 1055 of file BeautifulSoup.py.

01055 
01056     def endData(self, containerClass=NavigableString):
01057         if self.currentData:
01058             currentData = ''.join(self.currentData)
01059             if currentData.endswith('<') and self.convertHTMLEntities:
01060                 currentData = currentData[:-1] + '&lt;'
01061             if not currentData.strip():
01062                 if '\n' in currentData:
01063                     currentData = '\n'
01064                 else:
01065                     currentData = ' '
01066             self.currentData = []
01067             if self.parseOnlyThese and len(self.tagStack) <= 1 and \
01068                    (not self.parseOnlyThese.text or \
01069                     not self.parseOnlyThese.search(currentData)):
01070                 return
01071             o = containerClass(currentData)
01072             o.setup(self.currentTag, self.previous)
01073             if self.previous:
01074                 self.previous.next = o
01075             self.previous = o
01076             self.currentTag.contents.append(o)
01077 
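
The whitespace handling in the middle of endData can be read as a small pure function. The sketch below is a minimal restatement under that reading; collapse_whitespace_run is a hypothetical name, and the parseOnlyThese filtering and tree linking done by the real method are left out.

```python
def collapse_whitespace_run(text):
    """Collapse an all-whitespace string to one newline or one space."""
    if text.strip():
        # Strings with visible characters are kept unchanged.
        return text
    # Keep a single newline if the run contained one, else a single space.
    return "\n" if "\n" in text else " "
```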


def kss.core.BeautifulSoup.PageElement.extract (   self) [inherited]
Destructively rips this element out of the tree.

Definition at line 102 of file BeautifulSoup.py.

00102 
00103     def extract(self):
00104         """Destructively rips this element out of the tree."""        
00105         if self.parent:
00106             try:
00107                 self.parent.contents.remove(self)
00108             except ValueError:
00109                 pass
00110 
00111         #Find the two elements that would be next to each other if
00112         #this element (and any children) hadn't been parsed. Connect
00113         #the two.        
00114         lastChild = self._lastRecursiveChild()
00115         nextElement = lastChild.next
00116 
00117         if self.previous:
00118             self.previous.next = nextElement
00119         if nextElement:
00120             nextElement.previous = self.previous
00121         self.previous = None
00122         lastChild.next = None
00123 
00124         self.parent = None        
00125         if self.previousSibling:
00126             self.previousSibling.nextSibling = self.nextSibling
00127         if self.nextSibling:
00128             self.nextSibling.previousSibling = self.previousSibling
00129         self.previousSibling = self.nextSibling = None       
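
The relinking of previous and next in extract is an ordinary splice on a doubly linked chain. The sketch below shows only that step on a flat chain; the Node class and the link and unlink helpers are assumptions made for illustration, and parent and sibling bookkeeping is omitted.

```python
class Node:
    """Minimal stand-in for a page element in document order."""

    def __init__(self, label):
        self.label = label
        self.previous = None
        self.next = None


def link(nodes):
    """Chain the nodes together in the order given."""
    for left, right in zip(nodes, nodes[1:]):
        left.next = right
        right.previous = left


def unlink(first, last):
    """Splice the run first..last out of the chain it belongs to."""
    after = last.next
    if first.previous:
        first.previous.next = after
    if after:
        after.previous = first.previous
    # The removed run keeps its internal links but loses its outer ones.
    first.previous = None
    last.next = None


a, b, c, d = (Node(label) for label in "abcd")
link([a, b, c, d])
unlink(b, c)
```

Here first corresponds to the extracted element and last to the result of _lastRecursiveChild() in the listing.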


def kss.core.BeautifulSoup.PageElement.findAllNext (   self,
  name = None,
  attrs = {},
  text = None,
  limit = None,
  kwargs 
) [inherited]
Returns all items that match the given criteria and appear
after this Tag in the document.

Definition at line 203 of file BeautifulSoup.py.

00203     def findAllNext(self, name=None, attrs={}, text=None, limit=None,
00204                     **kwargs):
00205         """Returns all items that match the given criteria and appear
00206         after this Tag in the document."""
00207         return self._findAll(name, attrs, text, limit, self.nextGenerator,
                             **kwargs)


def kss.core.BeautifulSoup.PageElement.findAllPrevious (   self,
  name = None,
  attrs = {},
  text = None,
  limit = None,
  kwargs 
) [inherited]
Returns all items that match the given criteria and appear
before this Tag in the document.

Definition at line 228 of file BeautifulSoup.py.

00228     def findAllPrevious(self, name=None, attrs={}, text=None, limit=None,
00229                         **kwargs):
00230         """Returns all items that match the given criteria and appear
00231         before this Tag in the document."""
00232         return self._findAll(name, attrs, text, limit, self.previousGenerator,
                           **kwargs)


def kss.core.BeautifulSoup.PageElement.findNext (   self,
  name = None,
  attrs = {},
  text = None,
  kwargs 
) [inherited]
Returns the first item that matches the given criteria and
appears after this Tag in the document.

Definition at line 197 of file BeautifulSoup.py.

00197 
00198     def findNext(self, name=None, attrs={}, text=None, **kwargs):
00199         """Returns the first item that matches the given criteria and
00200         appears after this Tag in the document."""
00201         return self._findOne(self.findAllNext, name, attrs, text, **kwargs)


def kss.core.BeautifulSoup.PageElement.findNextSibling (   self,
  name = None,
  attrs = {},
  text = None,
  kwargs 
) [inherited]
Returns the closest sibling to this Tag that matches the
given criteria and appears after this Tag in the document.

Definition at line 208 of file BeautifulSoup.py.

00208 
00209     def findNextSibling(self, name=None, attrs={}, text=None, **kwargs):
00210         """Returns the closest sibling to this Tag that matches the
00211         given criteria and appears after this Tag in the document."""
00212         return self._findOne(self.findNextSiblings, name, attrs, text,
00213                              **kwargs)


def kss.core.BeautifulSoup.PageElement.findNextSiblings (   self,
  name = None,
  attrs = {},
  text = None,
  limit = None,
  kwargs 
) [inherited]
Returns the siblings of this Tag that match the given
criteria and appear after this Tag in the document.

Definition at line 215 of file BeautifulSoup.py.

00215     def findNextSiblings(self, name=None, attrs={}, text=None, limit=None,
00216                          **kwargs):
00217         """Returns the siblings of this Tag that match the given
00218         criteria and appear after this Tag in the document."""
00219         return self._findAll(name, attrs, text, limit,
                             self.nextSiblingGenerator, **kwargs)


def kss.core.BeautifulSoup.PageElement.findParent (   self,
  name = None,
  attrs = {},
  kwargs 
) [inherited]
Returns the closest parent of this Tag that matches the given
criteria.

Definition at line 249 of file BeautifulSoup.py.

00249 
00250     def findParent(self, name=None, attrs={}, **kwargs):
00251         """Returns the closest parent of this Tag that matches the given
00252         criteria."""
00253         # NOTE: We can't use _findOne because findParents takes a different
00254         # set of arguments.
00255         r = None
00256         l = self.findParents(name, attrs, 1)
00257         if l:
00258             r = l[0]
00259         return r
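
findParent obtains a single result by asking the list-returning method for at most one match and taking the first element. The sketch below shows that pattern on a plain list; find_all and find_first are hypothetical names, not BeautifulSoup functions.

```python
def find_all(items, predicate, limit=None):
    """Collect matching items, stopping once limit matches are found."""
    results = []
    for item in items:
        if predicate(item):
            results.append(item)
            if limit is not None and len(results) >= limit:
                break
    return results


def find_first(items, predicate):
    """Return the first match, or None when nothing matches."""
    matches = find_all(items, predicate, limit=1)
    return matches[0] if matches else None
```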


def kss.core.BeautifulSoup.PageElement.findParents (   self,
  name = None,
  attrs = {},
  limit = None,
  kwargs 
) [inherited]
Returns the parents of this Tag that match the given
criteria.

Definition at line 260 of file BeautifulSoup.py.

00260 
00261     def findParents(self, name=None, attrs={}, limit=None, **kwargs):
00262         """Returns the parents of this Tag that match the given
00263         criteria."""
00264 
00265         return self._findAll(name, attrs, None, limit, self.parentGenerator,
                             **kwargs)


def kss.core.BeautifulSoup.PageElement.findPrevious (   self,
  name = None,
  attrs = {},
  text = None,
  kwargs 
) [inherited]
Returns the first item that matches the given criteria and
appears before this Tag in the document.

Definition at line 222 of file BeautifulSoup.py.

00222 
00223     def findPrevious(self, name=None, attrs={}, text=None, **kwargs):
00224         """Returns the first item that matches the given criteria and
00225         appears before this Tag in the document."""
00226         return self._findOne(self.findAllPrevious, name, attrs, text, **kwargs)


def kss.core.BeautifulSoup.PageElement.findPreviousSibling (   self,
  name = None,
  attrs = {},
  text = None,
  kwargs 
) [inherited]
Returns the closest sibling to this Tag that matches the
given criteria and appears before this Tag in the document.

Definition at line 235 of file BeautifulSoup.py.

00235 
00236     def findPreviousSibling(self, name=None, attrs={}, text=None, **kwargs):
00237         """Returns the closest sibling to this Tag that matches the
00238         given criteria and appears before this Tag in the document."""
00239         return self._findOne(self.findPreviousSiblings, name, attrs, text,
00240                              **kwargs)


def kss.core.BeautifulSoup.PageElement.findPreviousSiblings (   self,
  name = None,
  attrs = {},
  text = None,
  limit = None,
  kwargs 
) [inherited]
Returns the siblings of this Tag that match the given
criteria and appear before this Tag in the document.

Definition at line 242 of file BeautifulSoup.py.

00242     def findPreviousSiblings(self, name=None, attrs={}, text=None,
00243                              limit=None, **kwargs):
00244         """Returns the siblings of this Tag that match the given
00245         criteria and appear before this Tag in the document."""
00246         return self._findAll(name, attrs, text, limit,
                             self.previousSiblingGenerator, **kwargs)


def kss.core.BeautifulSoup.BeautifulStoneSoup.handle_charref (   self,
  ref 
) [inherited]

Definition at line 1219 of file BeautifulSoup.py.

01219 
01220     def handle_charref(self, ref):
01221         "Handle character references as data."
01222         if ref[0] == 'x':
01223             data = unichr(int(ref[1:],16))
01224         else:
01225             data = unichr(int(ref))
01226         
01227         if u'\x80' <= data <= u'\x9F':
01228             data = UnicodeDammit.subMSChar(chr(ord(data)), self.smartQuotesTo)
01229         elif not self.convertHTMLEntities and not self.convertXMLEntities:
01230             data = '&#%s;' % ref
01231 
01232         self.handle_data(data)
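
A Python 3 sketch of the numeric-reference decoding is given below. It is not the library's code: decode_charref is a hypothetical name, accepting an uppercase X prefix is an addition, and code points 0x80 to 0x9F are decoded as Windows-1252 bytes directly instead of going through UnicodeDammit.subMSChar and smartQuotesTo.

```python
def decode_charref(ref):
    """Decode the body of a numeric reference such as '65' or 'x41'."""
    # Hexadecimal references carry an 'x' prefix (uppercase is an addition).
    code = int(ref[1:], 16) if ref[0] in "xX" else int(ref)
    if 0x80 <= code <= 0x9F:
        # These values are usually Windows-1252 bytes written as references.
        # Bytes with no Windows-1252 mapping become the replacement character.
        return bytes([code]).decode("windows-1252", errors="replace")
    return chr(code)
```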


def kss.core.BeautifulSoup.BeautifulStoneSoup.handle_comment (   self,
  text 
) [inherited]

Definition at line 1215 of file BeautifulSoup.py.

01215 
01216     def handle_comment(self, text):
01217         "Handle comments as Comment objects."
01218         self._toStringSubclass(text, Comment)


def kss.core.BeautifulSoup.BeautifulStoneSoup.handle_data (   self,
  data 
) [inherited]

Definition at line 1190 of file BeautifulSoup.py.

01190 
01191     def handle_data(self, data):
01192         if self.convertHTMLEntities:
01193             if data[0] == '&':
01194                 data = self.BARE_AMPERSAND.sub("&amp;",data)
01195             else:
01196                 data = data.replace('&','&amp;') \
01197                            .replace('<','&lt;') \
01198                            .replace('>','&gt;')
01199         self.currentData.append(data)
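
The replacement order in the else branch matters: the ampersand has to be escaped before the angle brackets. The sketch below contrasts the two orders; both function names are hypothetical, and the BARE_AMPERSAND branch is not reproduced because its pattern is not shown on this page.

```python
def escape_markup(data):
    """Escape '&' first, then '<' and '>', as the listing does."""
    return data.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")


def escape_markup_wrong_order(data):
    """Escaping '&' last re-escapes the entities just produced."""
    return data.replace("<", "&lt;").replace(">", "&gt;").replace("&", "&amp;")
```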


def kss.core.BeautifulSoup.BeautifulStoneSoup.handle_decl (   self,
  data 
) [inherited]

Definition at line 1251 of file BeautifulSoup.py.

01251 
01252     def handle_decl(self, data):
01253         "Handle DOCTYPEs and the like as Declaration objects."
01254         self._toStringSubclass(data, Declaration)


def kss.core.BeautifulSoup.BeautifulStoneSoup.handle_entityref (   self,
  ref 
) [inherited]
Handle entity references as data, possibly converting known
HTML entity references to the corresponding Unicode
characters.

Definition at line 1233 of file BeautifulSoup.py.

01233 
01234     def handle_entityref(self, ref):
01235         """Handle entity references as data, possibly converting known
01236         HTML entity references to the corresponding Unicode
01237         characters."""
01238         replaceWithXMLEntity = self.convertXMLEntities and \
01239                                self.XML_ENTITIES_TO_CHARS.has_key(ref)
01240         if self.convertHTMLEntities or replaceWithXMLEntity:
01241             try:
01242                 data = unichr(name2codepoint[ref])
01243             except KeyError:
01244                 if replaceWithXMLEntity:
01245                     data = self.XML_ENTITIES_TO_CHARS.get(ref)
01246                 else:
01247                     data="&amp;%s" % ref
01248         else:
01249             data = '&%s;' % ref
01250         self.handle_data(data)
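
The lookup at the core of this method can be tried in Python 3 with html.entities.name2codepoint. The sketch below is a simplification: resolve_entity is a hypothetical name, a single convert flag stands in for the convertHTMLEntities and convertXMLEntities settings, and the XML_ENTITIES_TO_CHARS fallback is left out.

```python
from html.entities import name2codepoint


def resolve_entity(ref, convert=True):
    """Turn an entity name into text, following the listing's fallbacks."""
    if not convert:
        # Conversion disabled: put the reference back as written.
        return "&%s;" % ref
    try:
        return chr(name2codepoint[ref])
    except KeyError:
        # Unknown name: emit an escaped ampersand followed by the name.
        return "&amp;%s" % ref
```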
        


def kss.core.BeautifulSoup.BeautifulStoneSoup.handle_pi (   self,
  text 
) [inherited]
Handle a processing instruction as a ProcessingInstruction
object, possibly one with a %SOUP-ENCODING% slot into which an
encoding will be plugged later.

Definition at line 1207 of file BeautifulSoup.py.

01207 
01208     def handle_pi(self, text):
01209         """Handle a processing instruction as a ProcessingInstruction
01210         object, possibly one with a %SOUP-ENCODING% slot into which an
01211         encoding will be plugged later."""
01212         if text[:3] == "xml":
01213             text = "xml version='1.0' encoding='%SOUP-ENCODING%'"
01214         self._toStringSubclass(text, ProcessingInstruction)


def kss.core.BeautifulSoup.PageElement.insert (   self,
  position,
  newChild 
) [inherited]

Definition at line 137 of file BeautifulSoup.py.

00137 
00138     def insert(self, position, newChild):
00139         if (isinstance(newChild, basestring)
00140             or isinstance(newChild, unicode)) \
00141             and not isinstance(newChild, NavigableString):
00142             newChild = NavigableString(newChild)        
00143 
00144         position =  min(position, len(self.contents))
00145         if hasattr(newChild, 'parent') and newChild.parent != None:
00146             # We're 'inserting' an element that's already one
00147             # of this object's children. 
00148             if newChild.parent == self:
00149                 index = self.find(newChild)
00150                 if index and index < position:
00151                     # Furthermore we're moving it further down the
00152                     # list of this object's children. That means that
00153                     # when we extract this element, our target index
00154                     # will jump down one.
00155                     position = position - 1
00156             newChild.extract()
00157             
00158         newChild.parent = self
00159         previousChild = None
00160         if position == 0:
00161             newChild.previousSibling = None
00162             newChild.previous = self
00163         else:
00164             previousChild = self.contents[position-1]
00165             newChild.previousSibling = previousChild
00166             newChild.previousSibling.nextSibling = newChild
00167             newChild.previous = previousChild._lastRecursiveChild()
00168         if newChild.previous:
00169             newChild.previous.next = newChild        
00170 
00171         newChildsLastElement = newChild._lastRecursiveChild()
00172 
00173         if position >= len(self.contents):
00174             newChild.nextSibling = None
00175             
00176             parent = self
00177             parentsNextSibling = None
00178             while not parentsNextSibling:
00179                 parentsNextSibling = parent.nextSibling
00180                 parent = parent.parent
00181                 if not parent: # This is the last element in the document.
00182                     break
00183             if parentsNextSibling:
00184                 newChildsLastElement.next = parentsNextSibling
00185             else:
00186                 newChildsLastElement.next = None
00187         else:
00188             nextChild = self.contents[position]            
00189             newChild.nextSibling = nextChild            
00190             if newChild.nextSibling:
00191                 newChild.nextSibling.previousSibling = newChild
00192             newChildsLastElement.next = nextChild
00193 
00194         if newChildsLastElement.next:
00195             newChildsLastElement.next.previous = newChildsLastElement
00196         self.contents.insert(position, newChild)


def kss.core.BeautifulSoup.BeautifulStoneSoup.isSelfClosingTag (   self,
  name 
) [inherited]
Returns true iff the given string is the name of a
self-closing tag according to this parser.

Definition at line 1018 of file BeautifulSoup.py.

01018 
01019     def isSelfClosingTag(self, name):
01020         """Returns true iff the given string is the name of a
01021         self-closing tag according to this parser."""
01022         return self.SELF_CLOSING_TAGS.has_key(name) \
01023                or self.instanceSelfClosingTags.has_key(name)
            


def kss.core.BeautifulSoup.PageElement.nextGenerator (   self) [inherited]

Definition at line 302 of file BeautifulSoup.py.

00302 
00303     def nextGenerator(self):
00304         i = self
00305         while i:
00306             i = i.next
00307             yield i
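
One consequence of advancing before yielding is that the generator's last item is None rather than an element. The sketch below reproduces the loop on a hypothetical Node class to make that visible; callers that iterate directly need to allow for the trailing None.

```python
class Node:
    """Minimal stand-in for an element with a next pointer."""

    def __init__(self, label):
        self.label = label
        self.next = None


def next_generator(start):
    """Yield everything after start, ending with a final None."""
    i = start
    while i:
        i = i.next
        yield i


a, b, c = Node("a"), Node("b"), Node("c")
a.next, b.next = b, c
labels = [node.label if node else None for node in next_generator(a)]
```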


def kss.core.BeautifulSoup.PageElement.nextSiblingGenerator (   self) [inherited]

Definition at line 308 of file BeautifulSoup.py.

00308 
00309     def nextSiblingGenerator(self):
00310         i = self
00311         while i:
00312             i = i.nextSibling
00313             yield i


def kss.core.BeautifulSoup.PageElement.parentGenerator (   self) [inherited]

Definition at line 326 of file BeautifulSoup.py.

00326 
00327     def parentGenerator(self):
00328         i = self
00329         while i:
00330             i = i.parent
00331             yield i


def kss.core.BeautifulSoup.BeautifulStoneSoup.parse_declaration (   self,
  i 
) [inherited]
Treat a bogus SGML declaration as raw data. Treat a CDATA
declaration as a CData object.

Definition at line 1255 of file BeautifulSoup.py.

01255 
01256     def parse_declaration(self, i):
01257         """Treat a bogus SGML declaration as raw data. Treat a CDATA
01258         declaration as a CData object."""
01259         j = None
01260         if self.rawdata[i:i+9] == '<![CDATA[':
01261              k = self.rawdata.find(']]>', i)
01262              if k == -1:
01263                  k = len(self.rawdata)
01264              data = self.rawdata[i+9:k]
01265              j = k+3
01266              self._toStringSubclass(data, CData)
01267         else:
01268             try:
01269                 j = SGMLParser.parse_declaration(self, i)
01270             except SGMLParseError:
01271                 toHandle = self.rawdata[i:]
01272                 self.handle_data(toHandle)
01273                 j = i + len(toHandle)
01274         return j
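
The CDATA branch is plain string slicing and can be sketched on its own. In the sketch below, read_cdata is a hypothetical helper that returns the section's text together with the index at which parsing would resume; the real method stores the text as a CData object and returns only the index.

```python
def read_cdata(rawdata, i):
    """Return (text, resume_index) for a CDATA section starting at i."""
    if rawdata[i:i + 9] != "<![CDATA[":
        return None
    k = rawdata.find("]]>", i)
    if k == -1:
        # Unterminated section: treat the rest of the input as its text.
        k = len(rawdata)
    return rawdata[i + 9:k], k + 3
```

For an unterminated section the resume index points past the end of the input, which matches the arithmetic in the listing.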


def kss.core.BeautifulSoup.BeautifulSOAP.popTag (   self)

Reimplemented from kss.core.BeautifulSoup.BeautifulStoneSoup.

Definition at line 1494 of file BeautifulSoup.py.

01494 
01495     def popTag(self):
01496         if len(self.tagStack) > 1:
01497             tag = self.tagStack[-1]
01498             parent = self.tagStack[-2]
01499             parent._getAttrMap()
01500             if (isinstance(tag, Tag) and len(tag.contents) == 1 and
01501                 isinstance(tag.contents[0], NavigableString) and 
01502                 not parent.attrMap.has_key(tag.name)):
01503                 parent[tag.name] = tag.contents[0]
01504         BeautifulStoneSoup.popTag(self)

def kss.core.BeautifulSoup.PageElement.previousGenerator (   self) [inherited]

Definition at line 314 of file BeautifulSoup.py.

00314 
00315     def previousGenerator(self):
00316         i = self
00317         while i:
00318             i = i.previous
00319             yield i


def kss.core.BeautifulSoup.PageElement.previousSiblingGenerator (   self) [inherited]

Definition at line 320 of file BeautifulSoup.py.

00320 
00321     def previousSiblingGenerator(self):
00322         i = self
00323         while i:
00324             i = i.previousSibling
00325             yield i


def kss.core.BeautifulSoup.BeautifulStoneSoup.pushTag (   self,
  tag 
) [inherited]

Definition at line 1048 of file BeautifulSoup.py.

01048 
01049     def pushTag(self, tag):
01050         #print "Push", tag.name
01051         if self.currentTag:
01052             self.currentTag.append(tag)
01053         self.tagStack.append(tag)
01054         self.currentTag = self.tagStack[-1]


def kss.core.BeautifulSoup.PageElement.replaceWith (   self,
  replaceWith 
) [inherited]

Definition at line 88 of file BeautifulSoup.py.

00088 
00089     def replaceWith(self, replaceWith):        
00090         oldParent = self.parent
00091         myIndex = self.parent.contents.index(self)
00092         if hasattr(replaceWith, 'parent') and replaceWith.parent == self.parent:
00093             # We're replacing this element with one of its siblings.
00094             index = self.parent.contents.index(replaceWith)
00095             if index and index < myIndex:
00096                 # Furthermore, it comes before this element. That
00097                 # means that when we extract it, the index of this
00098                 # element will change.
00099                 myIndex = myIndex - 1
00100         self.extract()        
00101         oldParent.insert(myIndex, replaceWith)
        


def kss.core.BeautifulSoup.BeautifulStoneSoup.reset (   self) [inherited]

Definition at line 1024 of file BeautifulSoup.py.

01024 
01025     def reset(self):
01026         Tag.__init__(self, self, self.ROOT_TAG_NAME)
01027         self.hidden = 1
01028         SGMLParser.reset(self)
01029         self.currentData = []
01030         self.currentTag = None
01031         self.tagStack = []
01032         self.quoteStack = []
01033         self.pushTag(self)
    


def kss.core.BeautifulSoup.PageElement.setup (   self,
  parent = None,
  previous = None 
) [inherited]
Sets up the initial relations between this element and
other elements.

Definition at line 76 of file BeautifulSoup.py.

00076 
00077     def setup(self, parent=None, previous=None):
00078         """Sets up the initial relations between this element and
00079         other elements."""        
00080         self.parent = parent
00081         self.previous = previous
00082         self.next = None
00083         self.previousSibling = None
00084         self.nextSibling = None
00085         if self.parent and self.parent.contents:
00086             self.previousSibling = self.parent.contents[-1]
00087             self.previousSibling.nextSibling = self

def kss.core.BeautifulSoup.PageElement.substituteEncoding (   self,
  str,
  encoding = None 
) [inherited]

Definition at line 333 of file BeautifulSoup.py.

00333 
00334     def substituteEncoding(self, str, encoding=None):
00335         encoding = encoding or "utf-8"
00336         return str.replace("%SOUP-ENCODING%", encoding)    
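
The %SOUP-ENCODING% placeholder written by handle_pi is filled in by a plain string replacement. A minimal standalone sketch, using a hypothetical module-level function in place of the method:

```python
def substitute_encoding(text, encoding=None):
    """Fill the %SOUP-ENCODING% slot, defaulting to utf-8."""
    return text.replace("%SOUP-ENCODING%", encoding or "utf-8")


declaration = "xml version='1.0' encoding='%SOUP-ENCODING%'"
as_default = substitute_encoding(declaration)
as_latin1 = substitute_encoding(declaration, "iso-8859-1")
```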


def kss.core.BeautifulSoup.PageElement.toEncoding (   self,
  s,
  encoding = None 
) [inherited]
Encodes an object to a string in some encoding, or to Unicode.

Definition at line 337 of file BeautifulSoup.py.

00337 
00338     def toEncoding(self, s, encoding=None):
00339         """Encodes an object to a string in some encoding, or to Unicode.
00340         ."""
00341         if isinstance(s, unicode):
00342             if encoding:
00343                 s = s.encode(encoding)
00344         elif isinstance(s, str):
00345             if encoding:
00346                 s = s.encode(encoding)
00347             else:
00348                 s = unicode(s)
00349         else:
00350             if encoding:
00351                 s  = self.toEncoding(str(s), encoding)
00352             else:
00353                 s = unicode(s)
00354         return s


def kss.core.BeautifulSoup.BeautifulStoneSoup.unknown_endtag (   self,
  name 
) [inherited]

Definition at line 1177 of file BeautifulSoup.py.

01177 
01178     def unknown_endtag(self, name):
01179         #print "End tag %s" % name
01180         if self.quoteStack and self.quoteStack[-1] != name:
01181             #This is not a real end tag.
01182             #print "</%s> is not real!" % name
01183             self.currentData.append('</%s>' % name)
01184             return
01185         self.endData()
01186         self._popToTag(name)
01187         if self.quoteStack and self.quoteStack[-1] == name:
01188             self.quoteStack.pop()
01189             self.literal = (len(self.quoteStack) > 0)


def kss.core.BeautifulSoup.BeautifulStoneSoup.unknown_starttag (   self,
  name,
  attrs,
  selfClosing = 0 
) [inherited]

Definition at line 1147 of file BeautifulSoup.py.

01147 
01148     def unknown_starttag(self, name, attrs, selfClosing=0):
01149         #print "Start tag %s: %s" % (name, attrs)
01150         if self.quoteStack:
01151             #This is not a real tag.
01152             #print "<%s> is not real!" % name
01153             attrs = ''.join(map(lambda(x, y): ' %s="%s"' % (x, y), attrs))
01154             self.currentData.append('<%s%s>' % (name, attrs))
01155             return        
01156         self.endData()
01157 
01158         if not self.isSelfClosingTag(name) and not selfClosing:
01159             self._smartPop(name)
01160 
01161         if self.parseOnlyThese and len(self.tagStack) <= 1 \
01162                and (self.parseOnlyThese.text or not self.parseOnlyThese.searchTag(name, attrs)):
01163             return
01164 
01165         tag = Tag(self, name, attrs, self.currentTag, self.previous)
01166         if self.previous:
01167             self.previous.next = tag
01168         self.previous = tag
01169         self.pushTag(tag)
01170         if selfClosing or self.isSelfClosingTag(name):
01171             self.popTag()                
01172         if name in self.QUOTE_TAGS:
01173             #print "Beginning quote (%s)" % name
01174             self.quoteStack.append(name)
01175             self.literal = 1
01176         return tag



Member Data Documentation

Definition at line 918 of file BeautifulSoup.py.

Definition at line 959 of file BeautifulSoup.py.

Definition at line 960 of file BeautifulSoup.py.

Definition at line 1028 of file BeautifulSoup.py.

Definition at line 1029 of file BeautifulSoup.py.

Definition at line 220 of file BeautifulSoup.py.

Definition at line 266 of file BeautifulSoup.py.

Definition at line 233 of file BeautifulSoup.py.

Definition at line 247 of file BeautifulSoup.py.

Definition at line 949 of file BeautifulSoup.py.

Definition at line 1026 of file BeautifulSoup.py.

string kss.core.BeautifulSoup.BeautifulStoneSoup.HTML_ENTITIES = "html" [static, inherited]

Definition at line 916 of file BeautifulSoup.py.

Definition at line 962 of file BeautifulSoup.py.

Definition at line 965 of file BeautifulSoup.py.

Definition at line 1174 of file BeautifulSoup.py.

Definition at line 970 of file BeautifulSoup.py.

Initial value:
[(re.compile('(<[^<>]*)/>'),
                       lambda x: x.group(1) + ' />'),
                      (re.compile('<!\s+([^<>]*)>'),
                       lambda x: '<!' + x.group(1) + '>')
                      ]

Definition at line 908 of file BeautifulSoup.py.

Definition at line 971 of file BeautifulSoup.py.

dictionary kss.core.BeautifulSoup.BeautifulStoneSoup.NESTABLE_TAGS = {} [static, inherited]

Definition at line 81 of file BeautifulSoup.py.

Definition at line 83 of file BeautifulSoup.py.

Reimplemented in kss.core.BeautifulSoup.BeautifulSoup.

Definition at line 983 of file BeautifulSoup.py.

Definition at line 79 of file BeautifulSoup.py.

Definition at line 948 of file BeautifulSoup.py.

Reimplemented from kss.core.BeautifulSoup.PageElement.

Definition at line 1074 of file BeautifulSoup.py.

Definition at line 82 of file BeautifulSoup.py.

dictionary kss.core.BeautifulSoup.BeautifulStoneSoup.QUOTE_TAGS = {} [static, inherited]

Reimplemented in kss.core.BeautifulSoup.BeautifulSoup.

Definition at line 906 of file BeautifulSoup.py.

Definition at line 1031 of file BeautifulSoup.py.

string kss.core.BeautifulSoup.BeautifulStoneSoup.ROOT_TAG_NAME = u'[document]' [static, inherited]

Definition at line 914 of file BeautifulSoup.py.

Reimplemented in kss.core.BeautifulSoup.BeautifulSoup.

Definition at line 903 of file BeautifulSoup.py.

Definition at line 950 of file BeautifulSoup.py.

Definition at line 1030 of file BeautifulSoup.py.

string kss.core.BeautifulSoup.BeautifulStoneSoup.XML_ENTITIES = "xml" [static, inherited]

Definition at line 917 of file BeautifulSoup.py.

Definition at line 963 of file BeautifulSoup.py.


The documentation for this class was generated from the following file: