
python3.2  3.2.2
lib2to3.pgen2.parse.Parser Class Reference


Public Member Functions

def __init__
def setup
def addtoken
def classify
def shift
def push
def pop

Public Attributes

 grammar
 convert
 stack
 rootnode
 used_names
_PyObject_HEAD_EXTRA Py_ssize_t ob_refcnt
struct _typeobject* ob_type

Detailed Description

Parser engine.

The proper usage sequence is:

p = Parser(grammar, [converter])  # create instance
p.setup([start])                  # prepare for parsing
<for each input token>:
    if p.addtoken(...):           # parse a token; may raise ParseError
        break
root = p.rootnode                 # root of abstract syntax tree

A Parser instance may be reused by calling setup() repeatedly.

A Parser instance contains state pertaining to the current token
sequence, and should not be used concurrently by different threads
to parse separate token sequences.

See driver.py for how to get input tokens by tokenizing a file or
string.

Parsing is complete when addtoken() returns True; the root of the
abstract syntax tree can then be retrieved from the rootnode
instance variable.  When a syntax error occurs, addtoken() raises
the ParseError exception.  There is no error recovery; the parser
cannot be used after a syntax error was reported (but it can be
reinitialized by calling setup()).
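
The usage sequence above can be sketched as a small driver function. This is a minimal sketch, not code from parse.py: run_parser, IncompleteInput and CountingParser are hypothetical names introduced here, and CountingParser is only a stand-in exposing the same setup()/addtoken()/rootnode interface, so the example runs without lib2to3 installed.

```python
class IncompleteInput(Exception):
    """Hypothetical error: the tokens ran out before addtoken() returned True."""


def run_parser(parser, tokens, start=None):
    """Feed (type, value, context) triples to a Parser-like object."""
    parser.setup(start)                   # reset state for a new token sequence
    for type_, value, context in tokens:
        if parser.addtoken(type_, value, context):
            return parser.rootnode        # addtoken() returned True: parse complete
    raise IncompleteInput("token stream ended before the parse completed")


class CountingParser:
    """Stand-in for Parser (assumption): 'accepts' after exactly n tokens."""

    def __init__(self, n):
        self.n = n

    def setup(self, start=None):
        self.seen = []
        self.rootnode = None

    def addtoken(self, type_, value, context):
        self.seen.append(value)
        if len(self.seen) == self.n:
            self.rootnode = tuple(self.seen)
            return True
        return False


tokens = [(1, "a", None), (1, "b", None)]
parser = CountingParser(2)
root = run_parser(parser, tokens)
root_again = run_parser(parser, tokens)   # reuse works because setup() resets state
```

Calling setup() at the top of the driver is what makes the same instance reusable, including after an error was raised part-way through an earlier run.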

Definition at line 27 of file parse.py.


Constructor & Destructor Documentation

def lib2to3.pgen2.parse.Parser.__init__(self, grammar, convert=None)
Constructor.

The grammar argument is a grammar.Grammar instance; see the
grammar module for more information.

The parser is not ready yet for parsing; you must call the
setup() method to get it started.

The optional convert argument is a function mapping concrete
syntax tree nodes to abstract syntax tree nodes.  If not
given, no conversion is done and the syntax tree produced is
the concrete syntax tree.  If given, it must be a function of
two arguments, the first being the grammar (a grammar.Grammar
instance), and the second being the concrete syntax tree node
to be converted.  The syntax tree is converted from the bottom
up.

A concrete syntax tree node is a (type, value, context, nodes)
tuple, where type is the node type (a token or symbol number),
value is None for symbols and a string for tokens, context is
None or an opaque value used for error reporting (typically a
(lineno, offset) pair), and nodes is a list of children for
symbols, and None for tokens.

An abstract syntax tree node may be anything; this is entirely
up to the converter function.
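
To illustrate the converter contract, here is a minimal sketch of a convert function working on hand-built concrete nodes. The type codes, collapse_single_child and convert_bottom_up are hypothetical names made up for this example; the grammar argument is passed as None only because this particular converter ignores it.

```python
NAME, EXPR, STMT = 1, 256, 257   # made-up type codes (tokens < 256 <= symbols)


def collapse_single_child(grammar, node):
    """Converter sketch: replace a symbol node with exactly one child by that child."""
    type_, value, context, children = node
    if children is not None and len(children) == 1:
        return children[0]
    return node


def convert_bottom_up(grammar, node, convert):
    """Apply convert to the children before their parent, as the parser does."""
    type_, value, context, children = node
    if children is not None:
        converted = [convert_bottom_up(grammar, c, convert) for c in children]
        # The parser does not attach a child whose conversion returned None.
        children = [c for c in converted if c is not None]
        node = (type_, value, context, children)
    return convert(grammar, node)


# Concrete nodes are (type, value, context, nodes) tuples.
leaf = (NAME, "x", (1, 0), None)          # token: string value, no child list
expr = (EXPR, None, (1, 0), [leaf])       # symbol: value None, list of children
stmt = (STMT, None, (1, 0), [expr])

result = convert_bottom_up(None, stmt, collapse_single_child)
```

In this sketch both symbol nodes collapse, so result is the NAME leaf itself.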

Definition at line 57 of file parse.py.

    def __init__(self, grammar, convert=None):
        """Constructor.

        The grammar argument is a grammar.Grammar instance; see the
        grammar module for more information.

        The parser is not ready yet for parsing; you must call the
        setup() method to get it started.

        The optional convert argument is a function mapping concrete
        syntax tree nodes to abstract syntax tree nodes.  If not
        given, no conversion is done and the syntax tree produced is
        the concrete syntax tree.  If given, it must be a function of
        two arguments, the first being the grammar (a grammar.Grammar
        instance), and the second being the concrete syntax tree node
        to be converted.  The syntax tree is converted from the bottom
        up.

        A concrete syntax tree node is a (type, value, context, nodes)
        tuple, where type is the node type (a token or symbol number),
        value is None for symbols and a string for tokens, context is
        None or an opaque value used for error reporting (typically a
        (lineno, offset) pair), and nodes is a list of children for
        symbols, and None for tokens.

        An abstract syntax tree node may be anything; this is entirely
        up to the converter function.

        """
        self.grammar = grammar
        self.convert = convert or (lambda grammar, node: node)



Member Function Documentation

def lib2to3.pgen2.parse.Parser.addtoken(self, type, value, context)
Add a token; return True iff this is the end of the program.

Definition at line 113 of file parse.py.

    def addtoken(self, type, value, context):
        """Add a token; return True iff this is the end of the program."""
        # Map from token to label
        ilabel = self.classify(type, value, context)
        # Loop until the token is shifted; may raise exceptions
        while True:
            dfa, state, node = self.stack[-1]
            states, first = dfa
            arcs = states[state]
            # Look for a state with this label
            for i, newstate in arcs:
                t, v = self.grammar.labels[i]
                if ilabel == i:
                    # Look it up in the list of labels
                    assert t < 256
                    # Shift a token; we're done with it
                    self.shift(type, value, newstate, context)
                    # Pop while we are in an accept-only state
                    state = newstate
                    while states[state] == [(0, state)]:
                        self.pop()
                        if not self.stack:
                            # Done parsing!
                            return True
                        dfa, state, node = self.stack[-1]
                        states, first = dfa
                    # Done with this token
                    return False
                elif t >= 256:
                    # See if it's a symbol and if we're in its first set
                    itsdfa = self.grammar.dfas[t]
                    itsstates, itsfirst = itsdfa
                    if ilabel in itsfirst:
                        # Push a symbol
                        self.push(t, self.grammar.dfas[t], newstate, context)
                        break # To continue the outer while loop
            else:
                if (0, state) in arcs:
                    # An accepting state, pop it and try something else
                    self.pop()
                    if not self.stack:
                        # Done parsing, but another token is input
                        raise ParseError("too much input",
                                         type, value, context)
                else:
                    # No success finding a transition
                    raise ParseError("bad input", type, value, context)
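
The listing relies on a particular DFA layout: each DFA is a (states, first) pair, each state is a list of (label, newstate) arcs, and an arc (0, state) pointing back at its own state marks that state as accepting. The following is a minimal sketch of those checks on a toy DFA. The label indices and the helper names is_accepting, is_accept_only and next_state are made up for illustration, and the first set is modelled as a plain set on the assumption that it only needs to support the `in` test.

```python
# Toy DFA in the (states, first) shape that addtoken() unpacks.
states = [
    [(1, 1)],            # state 0: on label 1 go to state 1
    [(2, 2), (0, 1)],    # state 1: on label 2 go to state 2, or stop here
    [(0, 2)],            # state 2: accept-only, nothing more can be shifted
]
first = {1}              # labels that may begin this symbol
dfa = (states, first)


def is_accepting(states, state):
    """True if the state carries the (0, state) arc, i.e. it may be popped."""
    return (0, state) in states[state]


def is_accept_only(states, state):
    """True if the (0, state) arc is the only arc, so popping is forced."""
    return states[state] == [(0, state)]


def next_state(states, state, ilabel):
    """Return the target of the arc carrying ilabel, or None if there is none."""
    for i, newstate in states[state]:
        if i == ilabel:
            return newstate
    return None


after_first_token = next_state(states, 0, 1)
```

State 1 shows why the two tests differ: it is accepting, so the parser may pop it when the next token does not fit, but it is not accept-only, so the parser does not pop it eagerly right after a shift.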


def lib2to3.pgen2.parse.Parser.classify(self, type, value, context)
Turn a token into a label.  (Internal)

Definition at line 161 of file parse.py.

    def classify(self, type, value, context):
        """Turn a token into a label.  (Internal)"""
        if type == token.NAME:
            # Keep a listing of all used names
            self.used_names.add(value)
            # Check for reserved words
            ilabel = self.grammar.keywords.get(value)
            if ilabel is not None:
                return ilabel
        ilabel = self.grammar.tokens.get(type)
        if ilabel is None:
            raise ParseError("bad token", type, value, context)
        return ilabel
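
The lookup order in classify() can be sketched as a standalone function over two dictionaries. This is a simplified sketch under stated assumptions: the type codes and label indices are made up, BadToken stands in for ParseError, and the keywords and tokens dictionaries only mimic the roles of grammar.keywords and grammar.tokens.

```python
NAME, NUMBER = 1, 2                  # made-up token type codes
keywords = {"if": 10, "else": 11}    # keyword text -> label index
tokens = {NAME: 20, NUMBER: 21}      # token type -> label index


class BadToken(Exception):
    """Stand-in for ParseError in this sketch."""


def classify(type_, value, used_names):
    """Map a token to a label: keywords win over the generic NAME label."""
    if type_ == NAME:
        used_names.add(value)        # recorded before the keyword check
        ilabel = keywords.get(value)
        if ilabel is not None:
            return ilabel
    ilabel = tokens.get(type_)
    if ilabel is None:
        raise BadToken((type_, value))
    return ilabel


used = set()
labels = [
    classify(NAME, "if", used),      # reserved word: keyword label
    classify(NAME, "x", used),       # ordinary name: generic NAME label
    classify(NUMBER, "3", used),     # non-name token: looked up by type
]
```

Because the name is recorded before the keyword lookup, reserved words such as "if" also end up in the set of used names.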


def lib2to3.pgen2.parse.Parser.pop(self)
Pop a nonterminal.  (Internal)

Definition at line 191 of file parse.py.

    def pop(self):
        """Pop a nonterminal.  (Internal)"""
        popdfa, popstate, popnode = self.stack.pop()
        newnode = self.convert(self.grammar, popnode)
        if newnode is not None:
            if self.stack:
                dfa, state, node = self.stack[-1]
                node[-1].append(newnode)
            else:
                self.rootnode = newnode
                self.rootnode.used_names = self.used_names


def lib2to3.pgen2.parse.Parser.push(self, type, newdfa, newstate, context)
Push a nonterminal.  (Internal)

Definition at line 184 of file parse.py.

    def push(self, type, newdfa, newstate, context):
        """Push a nonterminal.  (Internal)"""
        dfa, state, node = self.stack[-1]
        newnode = (type, None, context, [])
        self.stack[-1] = (dfa, newstate, node)
        self.stack.append((newdfa, 0, newnode))


def lib2to3.pgen2.parse.Parser.setup(self, start=None)
Prepare for parsing.

This *must* be called before starting to parse.

The optional argument is an alternative start symbol; it
defaults to the grammar's start symbol.

You can use a Parser instance to parse any number of programs;
each time you call setup() the parser is reset to an initial
state determined by the (implicit or explicit) start symbol.

Definition at line 89 of file parse.py.

    def setup(self, start=None):
        """Prepare for parsing.

        This *must* be called before starting to parse.

        The optional argument is an alternative start symbol; it
        defaults to the grammar's start symbol.

        You can use a Parser instance to parse any number of programs;
        each time you call setup() the parser is reset to an initial
        state determined by the (implicit or explicit) start symbol.

        """
        if start is None:
            start = self.grammar.start
        # Each stack entry is a tuple: (dfa, state, node).
        # A node is a tuple: (type, value, context, children),
        # where children is a list of nodes or None, and context may be None.
        newnode = (start, None, None, [])
        stackentry = (self.grammar.dfas[start], 0, newnode)
        self.stack = [stackentry]
        self.rootnode = None
        self.used_names = set() # Aliased to self.rootnode.used_names in pop()


def lib2to3.pgen2.parse.Parser.shift(self, type, value, newstate, context)
Shift a token.  (Internal)

Definition at line 175 of file parse.py.

    def shift(self, type, value, newstate, context):
        """Shift a token.  (Internal)"""
        dfa, state, node = self.stack[-1]
        newnode = (type, value, context, None)
        newnode = self.convert(self.grammar, newnode)
        if newnode is not None:
            node[-1].append(newnode)
        self.stack[-1] = (dfa, newstate, node)
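
How push(), shift() and pop() cooperate to build the tree can be sketched with plain tuples. This is a deliberately simplified sketch, not the methods above: the type codes are made up, the DFA and state slots of each stack entry are left as placeholders, no converter is applied, and the finished root is collected in a list named finished, which is introduced here for illustration.

```python
NAME, EXPR, STMT = 1, 256, 257   # made-up type codes

# As after setup(): one stack entry whose node is the start symbol.
stack = [(None, 0, (STMT, None, None, []))]
finished = []                    # receives the root once the stack empties


def push(type_, context):
    """Open a new symbol node; its child list starts empty."""
    stack.append((None, 0, (type_, None, context, [])))


def shift(type_, value, context):
    """Append a token node to the children of the node on top of the stack."""
    dfa, state, node = stack[-1]
    node[-1].append((type_, value, context, None))


def pop():
    """Close the top node and attach it to its parent, or make it the root."""
    dfa, state, node = stack.pop()
    if stack:
        stack[-1][2][-1].append(node)
    else:
        finished.append(node)


push(EXPR, (1, 0))               # begin an EXPR inside the STMT
shift(NAME, "x", (1, 0))         # the token becomes a child of EXPR
pop()                            # EXPR is attached to STMT
pop()                            # STMT becomes the root
root = finished[0]
```

The node tuples themselves are never rebuilt; only the child list in the last slot is mutated, which is why a parent sees a child appended after the parent was pushed.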



Member Data Documentation

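lib2to3.pgen2.parse.Parser.convert

Definition at line 87 of file parse.py.

lib2to3.pgen2.parse.Parser.grammar

Definition at line 86 of file parse.py.

_PyObject_HEAD_EXTRA Py_ssize_t _object::ob_refcnt [inherited]

Definition at line 107 of file object.h.

struct _typeobject* _object::ob_type [inherited]

Definition at line 108 of file object.h.

lib2to3.pgen2.parse.Parser.rootnode

Definition at line 110 of file parse.py.

lib2to3.pgen2.parse.Parser.stack

Definition at line 109 of file parse.py.

lib2to3.pgen2.parse.Parser.used_names

Definition at line 111 of file parse.py.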


The documentation for this class was generated from the following file: